ARCHIVE
for
RATIONAL MECHANICS
and
ANALYSIS

Edited by
C. TRUESDELL

Volume 4, Number 2

SPRINGER-VERLAG
BERLIN · GÖTTINGEN · HEIDELBERG
(Postverlagsort Berlin · 1.12.1959)


The ancients considered mechanics in a twofold respect: as rational, which proceeds
accurately by demonstration, and as practical. To practical mechanics all the manual
arts belong, from which mechanics took its name. But as artificers do not work with
perfect accuracy, it comes to pass that mechanics is so distinguished from geometry
that what is perfectly accurate is called geometrical, and what is less so is called
mechanical. Yet the errors are not in the art, but in the artificers. He that works with
less accuracy is an imperfect mechanic; and if any could work with perfect accuracy,
he would be the most perfect mechanic of all. NEWTON

The generality that I embrace, instead of dazzling our understanding, will rather reveal
to us the true laws of Nature in all their splendor, and we shall find in it still stronger
reasons to admire their beauty and simplicity. EULER

Those who love Analysis will see with pleasure Mechanics become a new branch
of it... LAGRANGE


The ARCHIVE FOR RATIONAL MECHANICS AND ANALYSIS nourishes
the discipline of mechanics as a deductive, mathematical science in the classical
tradition and promotes pure analysis, particularly in contexts of application. Its
purpose is to give rapid and full publication to researches of exceptional moment,
depth, and permanence.


Each memoir must meet a standard of rigor set by the best work in its field. 
Contributions must consist largely in original research; on occasion, an expository paper 
may be invited. 


English, French, German, Italian, and Latin are the languages of the Archive. 
Authors are urged to write clearly and well, avoiding an excessively condensed or 
crabbed style. 


Manuscripts intended for the Archive should be submitted to an appropriate 
member of the Editorial Board. 


The ARCHIVE FOR RATIONAL MECHANICS AND ANALYSIS appears in
numbers struck off as the material reaches the press; five numbers constitute a
volume. Subscriptions may be entered through any agent. The price is DM 96
per volume.


Notice is hereby given that for all articles published exclusive rights in all 
languages and countries rest with Springer-Verlag. Without express permission of 
Springer-Verlag, no reproduction of any kind is allowed. 


For each paper 75 offprints are provided free of charge. 




On the Thermostatics of Continuous Media 


BERNARD D. COLEMAN & WALTER NOLL 


Contents

1. Mechanical preliminaries
2. Thermomechanic states
3. The caloric equation of state
4. The isotropy group
5. Forces, stresses, and work
6. Definition of thermal equilibrium
7. Conditions for thermal equilibrium
8. The fundamental postulates
9. An alternative axiomatization
10. Infinitesimal deformations from an arbitrary state
11. Simple fluids
12. Isotropic materials
13. The free energy
14. Thermal stability
15. Mechanical stability
16. Gibbs’ thermostatics of fluids
References

Introduction 


In this article we regard thermostatics as being that branch of thermo- 
dynamics which deals with bodies which are at rest at the present time and 
which, for all practical purposes, may be regarded as having been at rest at 
all times in the past. 

We attempt to develop here a rigorous theory of thermostatics for continuous 
bodies in arbitrary states of strain. The thermodynamics of chemical reactions, 
phase transitions, and capillarity is not discussed. Our aim is to derive some 
of the fundamental laws of hydrostatics and elastostatics from thermodynamic 
principles. Among these laws are the existence of elastic potentials for stress- 
strain relations, the known inequalities of hydrostatics, and some new inequalities 
for hydrostatics and elastostatics. 

In his classic work, “On the Equilibrium of Heterogeneous Substances”,
J. W. GIBBS [1] laid down criteria for determining whether a given (global) state
of a body is thermodynamically stable. He used these criteria to derive particular 
equations and inequalities which represent conditions (in some cases necessary 


Arch, Rational Mech. Anal., Vol. 4 7 




and other cases sufficient) for various special states to be stable. The equations 
GIBBS obtained as necessary conditions for thermodynamic stability are now
recognized as fundamental laws in physical chemistry. GIBBS also derived 
inequalities which, apparently because they are in obvious accord with everyday 
experience and thus might be mistakenly called trivial, have attracted relatively 
little attention and are sometimes not even mentioned in modern thermodynamics 
courses. For example, in his treatment of homogeneous systems at rest under 
uniform hydrostatic pressure, GIBBS showed that a necessary condition for such 
a system to be in a stable state is that both its heat capacity at constant volume 
and its adiabatic modulus of compression be non-negative. It is inequalities of 
this type which are emphasized in the present paper. We take, however, a point 
of view different from that of GIBBS.

In the classical treatments of thermostatics (e.g., [1], [2]) the adjective stable 
is used in two senses. It is sometimes used as a modifier for the word equilibrium; 
i.e., one refers to “states of stable equilibrium”; or it is used as a modifier for the
word state; i.e., one refers to “stable states”. In this paper we never use the word
stable in the former sense. The theory which we develop here makes a careful 
distinction between local states, referring to a material point in a body, and global 
states, referring to the body as a whole. A local thermomechanic state is specified 
by giving the entropy density and the local configuration at a material point. 
A global thermomechanic state, on the other hand, is specified only when the 
entropy field and the complete configuration are specified for the entire body. 
We regard thermal equilibrium to be a property of local states. We consider just 
one type of thermal equilibrium. We define a state of thermal equilibrium as a 
local thermomechanic state which minimizes an appropriate potential rather than 
as a state at which a first variation vanishes. We regard stability as a property 
of only global states. We consider several types of stable states, defined as global 
thermomechanic states which minimize certain energy integrals subject to dif- 
ferent constraints. 

Our theory is based on two physical postulates. The first asserts that, at a 
material point, any local thermomechanic state can be an equilibrium state 
provided the local temperature and local forces have appropriate values. The 
second postulate is essentially the assumption that, at least in continuum 
mechanics, absolute temperatures are never negative. We believe that these 
physical postulates, which are stated in terms of our definition of equilibrium, 
contain the physical content for the statics of continuous media of the First and 
Second Laws of Thermodynamics. From our postulates we prove relationships 
between the stress-strain equation and the caloric equation of state, and we 
derive various inequalities restricting the form of the caloric equation of state. 
We should like to propose that the inequalities which we obtain for the finite 
theory of elasticity answer some of the questions raised by C. TRUESDELL [3] in 
his recent article, “Das ungelöste Hauptproblem der endlichen Elastizitätstheorie”.

Although our definition of thermal equilibrium is new, some of the definitions
of the stability of global states which we propose for study are similar to stability
definitions considered by GIBBS [1] and J. HADAMARD [4]. In particular, our
concepts of isothermal and adiabatic stability at fixed boundary are closely related
to, but not identical to, Hadamard stability*. We briefly discuss GIBBS’ theory
of the stability of fluid phases in §16. In a future article we hope to give a
discussion of GIBBS’ theory of the stability of fluid mixtures.


We regard the main tasks of the science of thermostatics to be, first, the 
exploration of the consequences for the caloric equation of state of the existence 
of local states of thermal equilibrium and, second, the derivation of useful 
necessary and sufficient criteria for global states to be stable. In the present 
paper, §6 to §13 are devoted to the first task and §14 to §16 deal briefly with
the second. From our present point of view, we should say that the great classical
thermodynamicists, GIBBS and DUHEM, devoted their main efforts to the second
task. 

It will be noticed that in this paper we never mention such notions as “reversible
processes” and “quasi-static processes”; in fact, our theory of thermostatics,
being a truly statical theory, has no need of “processes” at all.


In writing the present paper we have striven for a level of mathematical rigor 
comparable to that of works in pure mathematics rather than to that customary
in physics. 


Notation and basic mathematical concepts. We often find it convenient to 
distinguish between functions and their values. The basic local thermodynamic 
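variables are denoted by light face Greek minuscules: $\varepsilon, \psi, \eta, \vartheta, \dots$. Symbols
such as $\hat\varepsilon, \bar\varepsilon, \tilde\varepsilon, \dots$ and $\hat\psi, \bar\psi, \tilde\psi, \dots$ represent real valued functions whose values are
the thermodynamic variables $\varepsilon$ and $\psi$.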


We denote vectors and points of the three-dimensional Euclidean space $\mathscr{E}$
by bold face Latin minuscules: $\mathbf{v}, \mathbf{x}, \mathbf{y}, \dots$.


Second order tensors are denoted by light face Latin majuscules: F, U, Q, R, I. 
However, we reserve the symbols X and Z to represent material points of a 
physical body. The term tensor is used as a synonym for linear transformation. 
Tensors of order higher than two do not occur in this paper. For the trace of 
the tensor F we write tr F and for the determinant of F we write det F. We
say that F is invertible if F has an inverse $F^{-1}$; i.e., if $\det F \neq 0$. The transpose
of F is denoted by $F^T$. The identity transformation is written I. For the composition,
or product, of two linear transformations A and B we write simply AB.


* Hadamard stability requires (roughly) that the first variation of the integral 
of the elastic potential vanish, and that the second variation be non-negative, for all 
smooth variations in the state of strain which are compatible with a fixed boundary. 
This sort of stability is necessary but not sufficient for stability at fixed boundary 
as we define it here. In the theory of the propagation of waves in a perfectly elastic 
solid, Hadamard stability of a particular rest state implies the reality of all roots 
of the wave velocity equation for acceleration waves of arbitrary direction which 
might impinge on an object in that state. J. L. ERICKSEN & R. A. TOUPIN [5] have
recently considered a modification of Hadamard stability in which they require that
the second variation of the integral of the elastic potential be strictly positive. They
use their definition of stability to prove uniqueness theorems in the theory of small
deformations superimposed on large. R. HILL [14] also has recently discussed relationships
between uniqueness and stability. In the third article of his “Recherches
sur l’élasticité” P. DUHEM [6] formulated several definitions of stability which are
applicable to bodies with fixed and partially free surfaces; he also derived several 
necessary conditions on the equation of state for particular states of strain to be stable. 




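Let $\mathbf{h}(\mathbf{x})$ be a function for which both the range and the domain consist of
either vectors or points in Euclidean space $\mathscr{E}$. Assume that for $\mathbf{x}$ in some region
the derivative

$$\frac{d}{ds}\,\mathbf{h}(\mathbf{x} + s\mathbf{v})\Big|_{s=0} = \nabla\mathbf{h}(\mathbf{x}; \mathbf{v}) \tag{1}$$

exists for all $\mathbf{v}$ and is continuous in $\mathbf{x}$. It is the content of a fundamental theorem
of analysis that $\nabla\mathbf{h}(\mathbf{x}; \mathbf{v})$ is then a linear function of $\mathbf{v}$, and hence we can write

$$\nabla\mathbf{h}(\mathbf{x}; \mathbf{v}) = [\nabla\mathbf{h}(\mathbf{x})]\,\mathbf{v}, \tag{2}$$

where $\nabla\mathbf{h}(\mathbf{x})$ is a linear transformation (tensor), called the gradient of $\mathbf{h}$ at $\mathbf{x}$.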


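Similarly, the gradient of a real valued function $\varepsilon(F)$ of a tensor variable F
is a tensor valued function $\varepsilon_F(F)$ defined by the relation

$$\frac{d}{ds}\,\varepsilon(F + sA)\Big|_{s=0} = \operatorname{tr}[\varepsilon_F(F)\,A], \tag{3}$$

where A is an arbitrary tensor. If Cartesian coordinates are used, and if $\|F_{ij}\|$
is the matrix of F, then the matrix of $\varepsilon_F$ is given by

$$\|\varepsilon_F\| = \left\|\frac{\partial\varepsilon}{\partial F_{ji}}\right\|,$$

where i is the row and j the column index.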


We make frequent use of the following theorem, called the polar decomposition 
theorem: Any invertible tensor F has unique decompositions 


$$F = RU = VR, \tag{4}$$

where R is orthogonal (i.e., $RR^T = I$) and U, V are positive definite symmetric
tensors (i.e., $U = U^T$, $V = V^T$, and the proper numbers of U and V are all real
and greater than zero). In addition, we have

$$U = R^T V R, \qquad U^2 = F^T F, \qquad V^2 = F F^T. \tag{5}$$
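As a numerical illustration of the polar decomposition (4), the following Python sketch treats the two-dimensional case with det F > 0, where R is a proper rotation and can be written in closed form. The restriction to 2×2 matrices, the helper names `polar_2x2`, `matmul` and `transpose`, and the sample matrix are assumptions made for this illustration; the paper itself works in three dimensions.

```python
import math


def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(A):
    """Transpose of a 2x2 matrix."""
    return [[A[j][i] for j in range(2)] for i in range(2)]


def polar_2x2(F):
    """Return (R, U, V) with F = R U = V R for a 2x2 matrix with det F > 0.

    Illustrative closed form, valid in two dimensions only: the rotation
    angle t satisfies tan t = (c - b) / (a + d) for F = [[a, b], [c, d]].
    """
    (a, b), (c, d) = F
    if a * d - b * c <= 0.0:
        raise ValueError("this sketch assumes det F > 0")
    h = math.hypot(a + d, c - b)          # h > 0 whenever det F > 0
    cos_t, sin_t = (a + d) / h, (c - b) / h
    R = [[cos_t, -sin_t], [sin_t, cos_t]]
    U = matmul(transpose(R), F)           # right stretch tensor, U = R^T F
    V = matmul(F, transpose(R))           # left stretch tensor,  V = F R^T
    return R, U, V


F = [[2.0, 1.0], [0.0, 1.0]]              # hypothetical deformation gradient
R, U, V = polar_2x2(F)
```

The closed form used here is specific to two dimensions; in three dimensions one would typically obtain U as the positive definite square root of $F^T F$, in line with (5).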


Consider a smooth (i.e., continuously differentiable) real valued function
$\zeta(w)$ whose domain W is a region in a finite dimensional vector space. The
function $\zeta$ is called strictly convex if either of the following two equivalent conditions
is satisfied:

(a) For all $w_1$ and $w_2 \neq w_1$ in W and all positive $\alpha$, $\beta$ with $\alpha + \beta = 1$, the
inequality

$$\zeta(\alpha w_1 + \beta w_2) < \alpha\,\zeta(w_1) + \beta\,\zeta(w_2) \tag{6}$$

holds.

(b) For all w and $w^* \neq w$ in W the inequality

$$\zeta(w^*) - \zeta(w) - (w^* - w)\cdot\nabla\zeta(w) > 0 \tag{7}$$

is satisfied.

When W is a region in the space of all tensors, we use the notation of (3)
and the convexity inequality (7) becomes

$$\zeta(F^*) - \zeta(F) - \operatorname{tr}[(F^* - F)\,\zeta_F(F)] > 0. \tag{8}$$


For a twice continuously differentiable function $\zeta(w)$ to be strictly convex in W,
it is sufficient that the second gradient $\nabla\nabla\zeta(w)$ be positive definite for w in W.
This condition is not necessary, however: if $\zeta(w)$ is convex, it follows only that
$\nabla\nabla\zeta(w)$ is positive semidefinite.
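The remark that positive definiteness of the second gradient is sufficient but not necessary can be checked on a one-dimensional example. The sketch below is only an illustration under assumptions of our own choosing: the test function $w^4$, the grid, and the names `zeta`, `dzeta`, `d2zeta` and `gradient_gap` do not come from the paper. It evaluates the left-hand side of inequality (7) on a grid and records that the second derivative nevertheless vanishes at w = 0.

```python
def zeta(w):
    """Hypothetical test function: strictly convex on the real line."""
    return w ** 4


def dzeta(w):
    """First derivative of zeta."""
    return 4.0 * w ** 3


def d2zeta(w):
    """Second derivative of zeta; it vanishes at w = 0."""
    return 12.0 * w ** 2


def gradient_gap(w_star, w):
    """Left-hand side of inequality (7) in one dimension."""
    return zeta(w_star) - zeta(w) - (w_star - w) * dzeta(w)


# Grid of points -2.0, -1.75, ..., 2.0 (exactly representable in binary).
grid = [k / 4.0 for k in range(-8, 9)]
gaps = [gradient_gap(ws, w) for w in grid for ws in grid if ws != w]

strictly_convex_on_grid = all(g > 0.0 for g in gaps)
second_derivative_at_zero = d2zeta(0.0)
```

A finite grid cannot prove strict convexity; here the check merely agrees with the identity $w^{*4} - w^4 - 4w^3(w^* - w) = (w^* - w)^2\,(w^{*2} + 2w^*w + 3w^2)$, whose right-hand side is positive whenever $w^* \neq w$.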


1. Mechanical preliminaries 

We give a brief summary of those concepts from the mechanics of continuous 
media that are relevant to the present investigation. For a detailed discussion 
we refer to [7] and [8]. 

A body $\mathscr{B}$ is a smooth manifold of elements X, Z, ..., called material points*.
A configuration f of $\mathscr{B}$ is a smooth one-to-one mapping of $\mathscr{B}$ onto a region in a
three-dimensional Euclidean point space $\mathscr{E}$. The point $\mathbf{x} = f(X)$ is the position
of the material point X in the configuration f. The mass distribution m of $\mathscr{B}$ is
a measure defined on all Borel subsets of $\mathscr{B}$. For the total mass of $\mathscr{B}$ we write
$m(\mathscr{B})$. To each configuration f of $\mathscr{B}$ corresponds a mass density $\varrho$.

Consider a neighborhood $\mathscr{N}(X)$ of a material point X in a body; i.e., a part
of the body containing X in its interior. Let g be a smooth homeomorphism
of $\mathscr{N}(X)$ into the three-dimensional vector space $\mathscr{V}$ such that X itself is mapped
into the zero vector $\mathbf{0}$. The inverse mapping of g is denoted by $g^{-1}$. Let $g_1$ and $g_2$
be two such homeomorphisms. The composition $g_2 \circ g_1^{-1}$ of $g_2$ and $g_1^{-1}$ is defined by

$$(g_2 \circ g_1^{-1})(\mathbf{v}) = g_2\big(g_1^{-1}(\mathbf{v})\big).$$

It is a mapping of a neighborhood of $\mathbf{0}$ onto another neighborhood of $\mathbf{0}$. We
define an equivalence relation “~” among all these homeomorphisms by the
condition that $g_1 \sim g_2$ if the gradient of the mapping $g_2 \circ g_1^{-1}$ at $\mathbf{0}$ is the identity I.
The resulting equivalence classes will be called the local configurations** M of X.
If $M_1$ is the equivalence class of $g_1$ and $M_2$ the equivalence class of $g_2$, then the
gradient at $\mathbf{0}$ of $g_2 \circ g_1^{-1}$, i.e.,

$$G = \nabla(g_2 \circ g_1^{-1})(\mathbf{0}), \tag{1.1}$$

depends only on $M_1$ and $M_2$. We write

$$G = M_2 M_1^{-1}, \qquad M_2 = G M_1, \tag{1.2}$$

and call G the deformation gradient from $M_1$ to $M_2$; G is an invertible linear
transformation.
It is often convenient to employ a local reference configuration $M_r$ and to
characterize the other local configurations

$$M = F M_r \tag{1.3}$$

by their deformation gradients F from the local reference configuration $M_r$. If,
in this way, two local configurations $M_1$ and $M_2$ correspond, respectively, to $F_1$
and $F_2$, then the deformation gradient G from $M_1$ to $M_2$ is given by

$$G = F_2 F_1^{-1}, \qquad F_2 = G F_1. \tag{1.4}$$


* The term “particle” is often used. We prefer “material point” to avoid confusion
with molecules and other physical particles.
** The term “configuration gradient” was used in [7].


The rotation tensor R, the right stretch tensor* U, and the left stretch tensor V
of a deformation gradient F are defined by the unique polar decompositions

$$F = RU = VR, \tag{1.5}$$

where R is orthogonal, while U and $V = RUR^T$ are symmetric and positive
definite. We note that U and V have the same proper numbers; these proper
numbers are called the principal stretches $v_1, v_2, v_3$. A deformation gradient G
is called a pure stretch if its rotation tensor reduces to the identity I; i.e., if G
is symmetric and positive definite and hence coincides with its own right and
left stretch tensors.

The mass densities at X corresponding to the local configurations $M_1$ and $M_2$
are denoted, respectively, by $\varrho_1$ and $\varrho_2$. We have

$$\varrho_2 = \frac{\varrho_1}{|\det G|}, \tag{1.6}$$

where G is related to $M_1$ and $M_2$ by (1.2).


2. Thermomechanic states 


A global thermomechanic state, or simply a state, of a body $\mathscr{B}$ is a pair $\{f, \eta\}$
consisting of a configuration f of $\mathscr{B}$ and a scalar field $\eta$ defined on $\mathscr{B}$; $\eta$ is called
the entropy distribution of the state.

A local thermomechanic state, or simply a local state, of a material point X
is defined as a pair $(M, \eta)$ consisting of a local configuration M of X and a real
number $\eta$, called the entropy density (per unit mass) of the local state**.

In the following we often use a local reference configuration $M_r$ and, according
to (1.3), characterize the other local configurations M by the deformation gradients
F from $M_r$. We then use the pair $(F, \eta)$ to characterize the local states.

Two local states $(F, \eta)$ and $(F', \eta')$ will be called equivalent if they differ
only by a change of frame of reference. The local configuration transforms under
a change of frame according to the law $F' = QF$ where Q is orthogonal. We
assume that the entropy density $\eta$ is objective; i.e., it remains invariant under a
change of frame. Thus, the local states $(F, \eta)$ and $(F', \eta')$ are equivalent if and
only if

$$\eta' = \eta, \qquad F' = QF \tag{2.1}$$

for some orthogonal Q.

We say that two global states $\{f, \eta\}$ and $\{f', \eta'\}$ are equivalent if they differ
by only a change of frame. This is the case if and only if

$$\eta'(X) = \eta(X), \qquad F'(X) = Q F(X) \tag{2.2}$$

for all X in the body and some orthogonal tensor Q independent of X. Here,
$F(X)$ and $F'(X)$ are the deformation gradients at X corresponding to f and f′,
respectively.


* The term “strain tensor” was used in [7].

** In this article, pairs in braces, { }, always refer to global properties; the elements
of such pairs are fields over $\mathscr{B}$. On the other hand, pairs in parentheses, ( ), always refer
to local properties and have elements which are either real numbers or tensors. Note
that the symbol $\eta$ in $\{f, \eta\}$ and $(M, \eta)$ denotes different entities; in the first case $\eta$
denotes a field while in the second case it denotes a number. No confusion should
arise, however.




3. The caloric equation of state 
A material is characterized by a real valued function of local states, whose
values $\varepsilon$ are called the energy densities (per unit mass) of the local states. We
pick a fixed local reference configuration $M_r$ and characterize the state $(M, \eta)$
by the pair $(F, \eta)$ where $F = M M_r^{-1}$. We write

$$\varepsilon = \hat\varepsilon(F, \eta). \tag{3.1}$$

It is assumed here that the function $\hat\varepsilon$ has continuous derivatives with respect
to F and $\eta$.

We assume that the energy density is objective; i.e., invariant under a change
of frame. It follows from (2.1) that the function $\hat\varepsilon$ must satisfy the relation

$$\hat\varepsilon(QF, \eta) = \hat\varepsilon(F, \eta) \tag{3.2}$$

for all orthogonal Q. Using the polar decomposition (1.5) and putting $Q = R^T$
in (3.2) we see that

$$\varepsilon = \hat\varepsilon(F, \eta) = \hat\varepsilon(U, \eta); \tag{3.3}$$

i.e., that the energy density is determined by the right stretch tensor U and the
entropy $\eta$.
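The step from (3.2) to (3.3) can be written out explicitly. In the display below, $\hat\varepsilon$ stands for the energy function of (3.1) and F = RU is the polar decomposition (1.5); the only facts used are (3.2) with the particular choice $Q = R^T$ and the orthogonality relation $R^T R = I$.

```latex
\hat\varepsilon(F,\eta)
  \overset{(3.2)}{=} \hat\varepsilon(R^{T}F,\eta)
  = \hat\varepsilon(R^{T}RU,\eta)
  = \hat\varepsilon(U,\eta).
```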
The function $\hat\varepsilon$ in (3.3) depends on the choice of the local reference configuration
$M_r$. The function $\hat\varepsilon'$ corresponding to some other local reference configuration
$M_r'$ is related to $\hat\varepsilon$ by

$$\hat\varepsilon'(F, \eta) = \hat\varepsilon(FG, \eta), \tag{3.4}$$

where $G = M_r' M_r^{-1}$ is the deformation gradient from $M_r$ to $M_r'$.


The equation (3.3) characterizes the thermal and mechanical properties of a 
material in statics. It is called the caloric equation of state of the material. 


4. The isotropy group 
It may happen that the energy function $\hat\varepsilon$ remains the same function if the
local reference configuration $M_r$ is changed to another local reference configuration
$M_r' = H M_r$ with the same density. It follows from (3.4) that $\hat\varepsilon$ then satisfies
the relation

$$\hat\varepsilon(FH, \eta) = \hat\varepsilon(F, \eta). \tag{4.1}$$

Since $M_r$ and $M_r'$ have the same density, it is clear from (1.6) that $|\det H| = 1$;
i.e., H is a unimodular transformation. The unimodular transformations H for
which (4.1) holds form a group, called the isotropy group $\mathscr{G}$ of $\hat\varepsilon$ or of the material
defined by $\hat\varepsilon$. This group depends, in general, on the choice of the local reference
configuration, but it can be shown that the groups corresponding to two different
local configurations are always conjugate and hence isomorphic.

* For the application to physical situations it is necessary to limit the domain
of $\hat\varepsilon$ to a region in the space of local configurations and an interval on the $\eta$-axis. We
do not supply the mathematical details which may arise in the consideration of limitations
of this kind.

We say that the energy function $\hat\varepsilon$ defines a simple fluid if its isotropy group
$\mathscr{G}$ is the full unimodular group $\mathscr{U}$. If $\mathscr{G} = \mathscr{U}$ for one reference configuration,
then $\mathscr{G} = \mathscr{U}$ for all reference configurations. A material point is called a fluid
material point if its energy function defines a simple fluid. The caloric equation
of state (3.3) then reduces to the form

$$\varepsilon = \hat\varepsilon(F, \eta) = \bar\varepsilon(v, \eta), \tag{4.2}$$

where

$$v = \frac{1}{\varrho} = \frac{1}{\varrho_r}\,|\det F| \tag{4.3}$$

is the specific volume of the local configuration $M = F M_r$; $\varrho$ and $\varrho_r$ are the mass
densities corresponding to M and $M_r$. The function $\bar\varepsilon$ in (4.2) does not depend
on the choice of the reference configuration.
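The converse direction is easy to check numerically: any energy function of the form (4.2) and (4.3) is unchanged when F is replaced by FH with |det H| = 1, so its isotropy group is the full unimodular group. The Python sketch below does this for one sample; the particular function `eps_bar`, the value of `RHO_R`, and the matrices are hypothetical choices made only for illustration.

```python
import math

RHO_R = 1.0  # hypothetical mass density of the reference configuration


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def matmul3(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def eps_bar(v, eta):
    """Hypothetical fluid energy depending on specific volume and entropy only."""
    return v ** (-2.0 / 3.0) * math.exp(eta)


def eps_hat(F, eta):
    """Energy as a function of the deformation gradient, built as in (4.2)-(4.3)."""
    v = abs(det3(F)) / RHO_R
    return eps_bar(v, eta)


F = [[1.2, 0.3, 0.0], [0.1, 0.9, 0.2], [0.0, 0.4, 1.1]]
H_shear = [[1.0, 2.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]    # det H = 1
H_mixed = [[2.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, -1.0]]   # det H = -1
eta = 0.7

base = eps_hat(F, eta)
diff_shear = abs(eps_hat(matmul3(F, H_shear), eta) - base)
diff_mixed = abs(eps_hat(matmul3(F, H_mixed), eta) - base)
```

Both differences vanish up to rounding, because det(FH) = det F · det H and the energy depends on F only through |det F|.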

We say that a material point is an isotropic material point if the isotropy
group of its energy function $\hat\varepsilon$, relative to some local reference configuration,
contains the orthogonal group $\mathscr{O}$. Those local reference configurations of the
material point for which $\mathscr{G}$ contains $\mathscr{O}$ are said to be undistorted. A simple fluid
is isotropic, and all of its local configurations are undistorted. For any isotropic
material, it follows from (3.2) and (4.1) that $\hat\varepsilon$ satisfies the relation

$$\hat\varepsilon(QUQ^T, \eta) = \hat\varepsilon(U, \eta) \tag{4.4}$$

for all symmetric and positive definite U and all orthogonal Q, provided the local
reference configuration for $\hat\varepsilon$ is undistorted. Taking Q = R, so that $V = RUR^T$
is the left stretch tensor, we see that for isotropic material points the caloric
equation of state (3.3) may be written in the form

$$\varepsilon = \hat\varepsilon(V, \eta). \tag{4.5}$$

It is a further consequence of (4.4) that for each fixed value of $\eta$, $\varepsilon$ may be
expressed as a symmetric function of the three principal stretches $v_1, v_2, v_3$:

$$\varepsilon = \hat\varepsilon(F, \eta) = \hat\varepsilon(V, \eta) = \tilde\varepsilon(v_1, v_2, v_3; \eta) = \tilde\varepsilon(v_i, \eta). \tag{4.6}$$

It may also be expressed as a function of the three principal invariants $I_V$, $II_V$,
$III_V$ of V and U:

$$\varepsilon = \hat\varepsilon(V, \eta) = \check\varepsilon(I_V, II_V, III_V; \eta). \tag{4.7}$$


We say that the energy function $\hat\varepsilon$ defines a simple solid if its isotropy group
$\mathscr{G}$ is contained as a subgroup in the orthogonal group $\mathscr{O}$. A material point is
called a solid material point if its energy function $\hat\varepsilon$, relative to some local configuration
as a reference, defines a simple solid. The local reference configurations
with this property are again called the undistorted states of the solid material
point. For an isotropic simple solid, the isotropy group $\mathscr{G}$ is identical to the
orthogonal group $\mathscr{O}$.

Throughout the rest of this paper, whenever we discuss isotropic materials 
it is to be understood that the local reference configuration for the energy density 
function is undistorted, unless the reference configuration is explicitly specified. 


5. Forces, stresses, and work 


A system of forces is a system of vector valued measures, one for each part
of the body $\mathscr{B}$ under consideration*. One must distinguish between contact and
body forces. The contact force acting across an oriented surface element in $\mathscr{B}$
will be denoted by $d\mathbf{c}$.


* For a detailed axiomatic treatment cf. [8].




Definition of mechanical equilibrium. In order that a body B be in mechani- 
cal equilibrium under a given system of forces, two conditions must be fulfilled 
for each part P of B: (a) the sum of the forces acting on P must vanish, and (b) 
the sum of the moments, about any point, of the forces acting on P must vanish. 


The condition (a), called the force condition, depends only on the body and 
the force system, not on the configuration of the body. The condition (b), called 
the moment condition, does depend on the configuration of the body; i.e., for a
given force system, the moment condition may be satisfied for one configuration 
but not for another. 


The force condition alone implies that, for each configuration, the contact
forces $d\mathbf{c}$ arise from a stress tensor S, so that

$$d\mathbf{c} = S\mathbf{n}\,dA, \tag{5.1}$$

where $\mathbf{n}$ is the unit normal vector of the oriented surface element and dA its
area in the configuration under consideration. For fixed contact forces $d\mathbf{c}$, the
stress tensor S will be different for different configurations.


We consider now a neighborhood $\mathscr{N}(X)$ of a material point X and assume
that a system of contact forces $d\mathbf{c}$ is given for $\mathscr{N}(X)$. Let $f_r$ be a fixed reference
configuration and f some other configuration of $\mathscr{N}(X)$. If $d\mathbf{c}$ is such that the force
condition is satisfied, then (5.1) is valid for all configurations; we can write for
the reference configuration $f_r$, in particular,

$$d\mathbf{c} = S_r\mathbf{n}_r\,dA_r, \tag{5.2}$$

where $\mathbf{n}_r$ is the unit normal of the oriented surface element in the reference
configuration $f_r$, and $dA_r$ is the area of the surface element in $f_r$. We denote
the position vector, in the configuration f, of a typical material point Z in $\mathscr{N}(X)$,
relative to the position of X as origin, by $\mathbf{p}$, and we consider the tensor K defined
by

$$K = \frac{1}{v(\mathscr{N}(X))}\int_{\partial\mathscr{N}(X)} \mathbf{p}\otimes d\mathbf{c}, \tag{5.3}$$

where $\partial\mathscr{N}(X)$ denotes the boundary surface of $\mathscr{N}(X)$ and $v(\mathscr{N}(X))$ the volume
of $\mathscr{N}(X)$ in the configuration f, and where $\otimes$ denotes a tensor product. If the
force condition is satisfied, the relation (5.1) is valid, and we have

$$K = \frac{1}{v(\mathscr{N}(X))}\int_{\partial\mathscr{N}(X)} \mathbf{p}\otimes S\mathbf{n}\,dA.$$

In the limit as $\mathscr{N}(X)$ shrinks to X, we obtain, after using Green’s theorem,

$$S = \lim_{\mathscr{N}(X)\to X} K. \tag{5.4}$$


The same argument, with the configuration f replaced by the reference configuration
$f_r$, gives

$$S_r = \lim_{\mathscr{N}(X)\to X} \frac{1}{v_r(\mathscr{N}(X))}\int_{\partial\mathscr{N}(X)} \mathbf{p}_r\otimes d\mathbf{c}, \tag{5.5}$$

where $v_r(\mathscr{N}(X))$ is the volume of $\mathscr{N}(X)$ in the reference configuration and $\mathbf{p}_r$
the position vector, in the reference configuration, of a typical material point Z
in $\mathscr{N}(X)$, relative to the position of X as origin. The position vector $\mathbf{p}$ of Z in
the configuration f is related to $\mathbf{p}_r$ by the relation

$$\mathbf{p} = F\mathbf{p}_r + o(|\mathbf{p}_r|), \tag{5.6}$$

where F is the gradient at X of the deformation from $f_r$ to f and where

$$\lim_{d\to 0}\frac{o(d)}{d} = 0.$$

Substitution of (5.6) into (5.3) and use of (5.4) and (5.5) yields

$$S = \frac{\varrho}{\varrho_r}\,F S_r, \tag{5.7}$$

where $\varrho$ and $\varrho_r$ are, respectively, the mass densities at X in the configurations f
and $f_r$.

The skew part of K, defined by (5.3), is the moment about X, per unit volume,
of the contact forces $d\mathbf{c}$ acting on $\mathscr{N}(X)$ in the configuration f. If the moment
condition is satisfied for the configuration f, then the total moment (i.e., the
moment of the contact forces and the body forces) about X in f must vanish.
Since the moment per unit volume about X of the body forces on $\mathscr{N}(X)$ goes to
zero as $\mathscr{N}(X)$ shrinks to X, it follows from (5.4) that S must be symmetric if
the moment condition is satisfied in f.

We say that a material point X is in local mechanical equilibrium, when the 
body is in a given configuration and under a given force system, if the stress 
tensor S exists at X and is symmetric. 

The local behavior at X of a system of contact forces is completely determined
by the tensor $S_r$ defined by (5.2). It is called the Kirchhoff tensor* of the system.
For a given force system, the Kirchhoff tensor depends only on the choice of
the reference configuration and remains the same if the actual configuration is
changed. From (5.7) we see that the existence of the Kirchhoff tensor $S_r$ and
the symmetry of $F S_r$ are necessary and sufficient conditions for local mechanical
equilibrium at a material point in the local configuration determined by F.

In order that a body $\mathscr{B}$ in a configuration f be in mechanical equilibrium,
it is not sufficient that all its material points be in local mechanical equilibrium;
i.e., that the stress tensor exist and be symmetric at each material point. Global
mechanical equilibrium will prevail only if, in addition, Cauchy’s law

$$\operatorname{div} S + \varrho\,\mathbf{b} = 0 \tag{5.8}$$

is satisfied. In this equation, S, $\varrho$, and the density $\mathbf{b}$ of the body forces are to
be regarded as fields with domain $f(\mathscr{B})$.

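* Cf. TRUESDELL [9], (26.5).

We consider now a smooth one-parameter family of configurations f(s) with
deformation gradients F(s) at X. The work per unit mass done on $\mathscr{N}(X)$ by the
contact forces $d\mathbf{c}$ along the path of configurations f(s) from $s = s_1$ to $s = s_2$ is
defined by

$$w_{12} = \frac{1}{m(\mathscr{N}(X))}\int_{s_1}^{s_2}\int_{\partial\mathscr{N}(X)} \frac{d\mathbf{p}}{ds}\cdot d\mathbf{c}\;ds, \tag{5.9}$$

where $m(\mathscr{N}(X))$ is the mass of $\mathscr{N}(X)$ and $\mathbf{p}(s)$ denotes the position vector, in
the configuration f(s), of a typical material point in $\mathscr{N}(X)$. Assuming that the
contact forces $d\mathbf{c}$ are independent of s, we obtain

$$w_{12} = \frac{1}{m(\mathscr{N}(X))}\left[\int_{\partial\mathscr{N}(X)} \mathbf{p}(s_2)\cdot d\mathbf{c} - \int_{\partial\mathscr{N}(X)} \mathbf{p}(s_1)\cdot d\mathbf{c}\right]. \tag{5.10}$$

Observing (5.3), (5.4) and (5.7), and taking the limit $\mathscr{N}(X)\to X$, we get

$$w_{12} = \frac{1}{\varrho_r}\operatorname{tr}[F(s_2)\,S_r] - \frac{1}{\varrho_r}\operatorname{tr}[F(s_1)\,S_r]. \tag{5.11}$$

This relation shows that $-\frac{1}{\varrho_r}\operatorname{tr}(F S_r)$ has the physical meaning of the potential
energy, per unit mass, of the local contact forces.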
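The passage from (5.10) to (5.11) can be spelled out for each of the two boundary integrals. The following display is a sketch of that limit, assuming only that the trace of the tensor product $\mathbf{p}\otimes d\mathbf{c}$ equals the inner product $\mathbf{p}\cdot d\mathbf{c}$ and that the ratio of mass to volume of the neighborhood tends to the mass density $\varrho(s)$ at X in the configuration f(s); K(s), S(s) and F(s) denote the quantities of (5.3), (5.4) and (5.7) evaluated in that configuration.

```latex
\begin{aligned}
\frac{1}{m(\mathscr{N}(X))}\int_{\partial\mathscr{N}(X)} \mathbf{p}(s)\cdot d\mathbf{c}
  &= \frac{v(\mathscr{N}(X))}{m(\mathscr{N}(X))}\,\operatorname{tr}K(s)
  && \text{by (5.3)},\\
  &\longrightarrow \frac{1}{\varrho(s)}\,\operatorname{tr}S(s)
  && \text{as } \mathscr{N}(X)\to X,\ \text{by (5.4)},\\
  &= \frac{1}{\varrho_r}\,\operatorname{tr}[F(s)\,S_r]
  && \text{by (5.7)}.
\end{aligned}
```

Taking the difference of the values at $s = s_2$ and $s = s_1$ gives (5.11).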


6. Definition of thermal equilibrium 


A force temperature pair for a material point X is a pair $(S_r, \vartheta)$ consisting
of a tensor $S_r$, to be interpreted as the Kirchhoff tensor of a system of contact
forces at X, and a real number $\vartheta$, to be interpreted as the temperature at X.

Let a force temperature pair $(S_r, \vartheta)$ be given and consider the function

$$\lambda(F, \eta) = \hat\varepsilon(F, \eta) - \frac{1}{\varrho_r}\operatorname{tr}(F S_r) - \eta\vartheta. \tag{6.1}$$

To help motivate the definition of thermal equilibrium given below, we make
the following remarks. According to (5.11) the term $-\frac{1}{\varrho_r}\operatorname{tr}(F S_r)$ is the potential
energy, per unit mass, of the local contact forces. The term $-\eta\vartheta$ may be interpreted
as a thermal potential energy. Thus, the value $\lambda = \lambda(F, \eta)$ gives a kind
of free energy per unit mass of the local state $(F, \eta)$ when under the action of
the force temperature pair $(S_r, \vartheta)$.


Definition of thermal equilibrium. The local state $(F, \eta)$ is called a state
of thermal equilibrium under a given force temperature pair $(S_r, \vartheta)$ if

(a) the stress tensor $S = (\varrho/\varrho_r)\,F S_r$ is symmetric,

(b) the inequality

$$\lambda(F^*, \eta^*) > \lambda(F, \eta) \tag{6.2}$$

holds for all states $(F^*, \eta^*) \neq (F, \eta)$ such that

$$F^* = GF, \tag{6.3}$$

where G is symmetric and positive definite.

The condition (a) means that F corresponds to a local configuration in local
mechanical equilibrium (cf. § 5). The condition (b) means that a change of state
increases the free energy $\lambda$ provided that the configuration of the changed state
is related to the original configuration by a pure stretch G (cf. § 1).


7. Conditions for thermal equilibrium 


In this section we show that, for a local state $(F,\eta)$ to be a state of thermal
equilibrium under the force temperature pair $(S_r,\theta)$, the following three conditions


108 BERNARD D. COLEMAN & WALTER NOLL: 


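are necessary and sufficient:

($\alpha$) The stress tensor $S=(\rho/\rho_r)FS_r$ is given by the stress relation*

$$ S = \rho F\,\hat\varepsilon_F(F,\eta). \tag{7.1} $$

($\beta$) The temperature $\theta$ is given by the temperature relation

$$ \theta = \hat\varepsilon_\eta(F,\eta). \tag{7.2} $$

($\gamma$) The inequality

$$ \hat\varepsilon(F^*,\eta^*) - \hat\varepsilon(F,\eta) - \operatorname{tr}[(F^*-F)\,\hat\varepsilon_F(F,\eta)] - (\eta^*-\eta)\,\hat\varepsilon_\eta(F,\eta) > 0 \tag{7.3} $$

holds if $(F^*,\eta^*)\neq(F,\eta)$ and $F^*$ is related to $F$ by $F^*=GF$, where $G$ is positive
definite and symmetric.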

We assume first that $(F,\eta)$ is a state of thermal equilibrium and prove the
validity of ($\alpha$), ($\beta$), and ($\gamma$). By (6.2) and (6.3), the function $\lambda(GF,\eta^*)$ of the
symmetric tensor variable $G$ and the scalar variable $\eta^*$ has a minimum for $G=I$
and $\eta^*=\eta$. By a theorem of calculus, it follows that the derivatives of $\lambda(GF,\eta^*)$
with respect to $G$ and $\eta^*$ must vanish for $G=I$ and $\eta^*=\eta$. If we set the derivative
of $\lambda(GF,\eta^*)$ with respect to $\eta^*$ equal to zero at $\eta^*=\eta$, we obtain the temperature
relation (7.2). The gradient of $\lambda(GF,\eta^*)$ with respect to $G$ may be computed
using the formula (3) of the mathematical preliminaries and (6.1); we obtain
the equation

$$ \operatorname{tr}\Big\{\Big[F\,\hat\varepsilon_F(F,\eta) - \frac{1}{\rho_r}FS_r\Big]A\Big\} = 0, \tag{7.4} $$

which is valid for arbitrary symmetric tensors $A$. Using (5.7) the equation (7.4)
may be rewritten in the form

$$ \operatorname{tr}\{[\rho F\,\hat\varepsilon_F(F,\eta) - S]A\} = 0. \tag{7.5} $$

By the condition (a) of the definition of thermal equilibrium, $S$ is symmetric.
It follows from (3.2) and Theorem I of reference [10], p. 42, that $\rho F\,\hat\varepsilon_F(F,\eta)$ is
also symmetric. Thus, the tensor $\rho F\,\hat\varepsilon_F(F,\eta)-S$ is symmetric. On the other
hand, (7.5) can be valid for arbitrary symmetric $A$ only if $\rho F\,\hat\varepsilon_F(F,\eta)-S$ is
skew; whence it follows that $\rho F\,\hat\varepsilon_F(F,\eta)-S$ must vanish, which proves (7.1).
The inequality (7.3) is obtained simply by substitution of (7.1) and (7.2) into
the inequality (6.2), after $\lambda$ is replaced by its definition (6.1).

We assume now that the conditions ($\alpha$), ($\beta$), and ($\gamma$) are satisfied. It then
follows from (7.1), (3.2) and the theorem of reference [10] mentioned above that
the stress tensor $S$ must be symmetric, so that condition (a) of the definition
of thermal equilibrium is satisfied. Furthermore, the Kirchhoff tensor is given by

$$ S_r = \rho_r\,\hat\varepsilon_F(F,\eta). \tag{7.6} $$

Substitution of (7.6) and (7.2) into the inequality (7.3) gives the inequality (6.2);
hence condition (b) of the definition of equilibrium is also satisfied.
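
The content of this section can be illustrated by a one-dimensional sketch. The code below is only an illustration under stated assumptions: the deformation gradient is replaced by a scalar stretch `F > 0`, and the energy function `energy` is a hypothetical toy choice (jointly strictly convex in `F` and `eta`), not a function taken from the paper. The pair $(S_r,\theta)$ is computed from the scalar analogues of (7.6) and (7.2), and the scalar analogue of (6.2) is then checked on a small grid of comparison states.

```python
import math

RHO_R = 1.0  # reference density (assumed value, for illustration only)

def energy(F, eta):
    """Hypothetical toy energy function of a scalar stretch F and entropy eta."""
    return 0.5 * (F - 1.0) ** 2 + math.exp(eta)

def energy_F(F, eta):
    """Partial derivative of the toy energy with respect to F."""
    return F - 1.0

def energy_eta(F, eta):
    """Partial derivative of the toy energy with respect to eta."""
    return math.exp(eta)

def lam(F, eta, S_r, theta):
    """Scalar analogue of the function (6.1)."""
    return energy(F, eta) - F * S_r / RHO_R - theta * eta

def equilibrium_pair(F, eta):
    """Force temperature pair from the scalar analogues of (7.6) and (7.2)."""
    return RHO_R * energy_F(F, eta), energy_eta(F, eta)

def is_strict_minimum(F, eta, stretches, entropies):
    """Check the scalar analogue of (6.2) for F* = g F on a finite grid."""
    S_r, theta = equilibrium_pair(F, eta)
    base = lam(F, eta, S_r, theta)
    for g in stretches:
        for eta_star in entropies:
            if g == 1.0 and eta_star == eta:
                continue  # the state itself is excluded in (6.2)
            if lam(g * F, eta_star, S_r, theta) <= base:
                return False
    return True
```

A finite grid can only fail to find a counterexample; it does not prove the inequality. For this toy energy the inequality holds for all comparison states because the function is strictly convex.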




* This is the familiar stress-strain relation of finite elasticity theory (cf. [10], (16.4)). 




8. The fundamental postulates 
We are now able to lay down our two fundamental postulates:

Postulate I. For every local state $(F,\eta)$ for which $\hat\varepsilon(F,\eta)$ is defined there
exists a force temperature pair $(S_r,\theta)$ such that $(F,\eta)$ is a state of thermal equilibrium
under $(S_r,\theta)$.

Postulate II. The energy function $\hat\varepsilon(F,\eta)$ is strictly increasing in $\eta$ for each
fixed $F$.

Postulate I and the results of the previous section yield the following theorems:

Theorem 1. The force temperature pair $(S_r,\theta)$ which makes the local state
$(F,\eta)$ a state of thermal equilibrium is given by

$$ S_r = \rho_r\,\hat\varepsilon_F(F,\eta), \tag{8.1} $$
$$ \theta = \hat\varepsilon_\eta(F,\eta). \tag{8.2} $$

Theorem 2. The energy function $\hat\varepsilon$ obeys the inequality

$$ \hat\varepsilon(F^*,\eta^*) - \hat\varepsilon(F,\eta) - \operatorname{tr}[(F^*-F)\,\hat\varepsilon_F(F,\eta)] - (\eta^*-\eta)\,\hat\varepsilon_\eta(F,\eta) > 0 \tag{8.3} $$

for any two local states $(F,\eta)$ and $(F^*,\eta^*)$, in the domain of definition of $\hat\varepsilon$, which
are related by

$$ F^* = GF, \tag{8.4} $$

where $G$ is symmetric and positive definite.

The discussion of the previous section shows that Theorem 2 is equivalent to
Postulate I. In fact, if we are given a state $(F,\eta)$, we can define a force
temperature pair $(S_r,\theta)$ according to (8.1) and (8.2) and then use the inequality
(8.3) to prove that $(F,\eta)$ is in equilibrium under $(S_r,\theta)$.

The inequality (8.3) of Theorem 2 is a restricted convexity condition on the
function $\hat\varepsilon$. If we take, in particular, $F^*=F$, then (8.3) reduces to

$$ \hat\varepsilon(F,\eta^*) - \hat\varepsilon(F,\eta) - (\eta^*-\eta)\,\hat\varepsilon_\eta(F,\eta) > 0 \tag{8.5} $$

for $\eta^*\neq\eta$. This inequality is the content of the following corollary to Theorem 2:

Theorem 3. For each fixed local configuration, the energy density is given by
a strictly convex function of the entropy density.

This theorem is equivalent to the statement that $\hat\varepsilon_\eta(F,\eta)$ must be a strictly
increasing function of $\eta$ for each fixed $F$. It follows that the equation (8.2) can
be solved for $\eta$ in a unique manner:

$$ \eta = \hat\eta(F,\theta). \tag{8.6} $$

Here, $\hat\eta$ is a strictly increasing function* of $\theta$ for each $F$. The fact that (8.6)
is obtained by solving (8.2) for $\eta$ is expressed by the identity

$$ \hat\varepsilon_\eta[F,\hat\eta(F,\theta)] = \theta. \tag{8.7} $$

* The specific heat $c$ at fixed strain is given by $c=\theta\,\hat\eta_\theta(F,\theta)$. Hence, it is a
consequence of Theorem 3 that $c/\theta$ is never negative and, for each $F$, is strictly
positive except possibly for a nowhere dense set of values of $\theta$.




If we take $\eta^*=\eta$ in (8.3), we obtain

$$ \hat\varepsilon(F^*,\eta) - \hat\varepsilon(F,\eta) - \operatorname{tr}[(F^*-F)\,\hat\varepsilon_F(F,\eta)] > 0; \tag{8.8} $$

this inequality holds whenever $F^*=GF$, where $G\neq I$ is symmetric and positive
definite.

A local state $(F,\eta)$ is called a natural state if the corresponding stress (8.1)
vanishes. Keeping the entropy fixed, we may use the local configuration of the
natural state as the reference configuration, so that $F=I$ and $\hat\varepsilon_F(I,\eta)=\frac{1}{\rho_r}S_r=0$.
In this case, the inequality (8.8), by (8.4), reduces to

$$ \hat\varepsilon(G,\eta) > \hat\varepsilon(I,\eta), \tag{8.9} $$

which is valid for arbitrary symmetric and positive definite $G\neq I$. Replacing
$G$ by the right stretch tensor $U$ of an arbitrary deformation gradient $F$ and
using (3.3), we see that

$$ \hat\varepsilon(F,\eta) \geq \hat\varepsilon(I,\eta); \tag{8.10} $$

this expression becomes an equality only when $F$ is orthogonal; i.e., when $(F,\eta)$
is equivalent to $(I,\eta)$. Hence, the energy density is smallest in a natural state.
It should be pointed out that this observation, though important for the theory
of simple solids, is vacuous for fluids. For, we shall prove in § 11 that the stress
on a fluid material point in thermal equilibrium is always a strictly positive
pressure; thus, for a fluid there is no natural state.

We note that the restriction (8.4) on the inequality (8.3) of Theorem 2 is
essential for application of the present theory to physical situations. This
restriction means that the local configurations corresponding to $F^*$ and $F$ must
be related by a pure stretch. If, for example, these local configurations were
related by a rotation so that $F^*=QF$, with $Q$ an orthogonal transformation,
then the left side of (8.8) would reduce to $-\operatorname{tr}[(Q-I)F\,\hat\varepsilon_F(F,\eta)]$, since $\hat\varepsilon(F^*,\eta)$
would equal $\hat\varepsilon(F,\eta)$ by (3.2). The stress relation (7.1) shows that the left side of
(8.8) would then become $-\frac{1}{\rho}\operatorname{tr}[(Q-I)S]$. One can show that this expression
can be made negative by an appropriate choice of $Q$ if $S$ has at least one negative
proper number. Thus, the inequality (8.8), were it to hold for arbitrary pairs
$F$, $F^*$, would exclude the possibility of thermal equilibrium under compression
stresses, which is certainly not in accord with experience*.
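
The remark about rotations can be checked numerically in the simplest case. The sketch below is an illustration only: it assumes a hydrostatic stress $S=-pI$ and a rotation $Q$ about a fixed axis, and evaluates the expression $-\frac{1}{\rho}\operatorname{tr}[(Q-I)S]$ directly. The function names are hypothetical helpers introduced here.

```python
import math

def rotation_z(phi):
    """Matrix of a rotation by the angle phi about the third axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def hydrostatic(p):
    """Stress S = -p I; p > 0 is a pressure, p < 0 a uniform tension."""
    return [[-p if i == j else 0.0 for j in range(3)] for i in range(3)]

def trace_of_product(A, B):
    """tr(A B) for 3x3 matrices stored as nested lists."""
    return sum(A[i][j] * B[j][i] for i in range(3) for j in range(3))

def rotated_left_side(S, Q, rho):
    """The value -(1/rho) tr[(Q - I) S] taken by the left side of (8.8) when F* = Q F."""
    Q_minus_I = [[Q[i][j] - (1.0 if i == j else 0.0) for j in range(3)]
                 for i in range(3)]
    return -trace_of_product(Q_minus_I, S) / rho
```

For $S=-pI$ the expression equals $(p/\rho)(\operatorname{tr}Q-3)$, which is negative for every rotation $Q\neq I$ when $p>0$ and positive when $p<0$.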


9. An alternative axiomatization 
In this section we hope to make clear our reasons for assuming Postulate II
and to motivate further our definition of equilibrium.

It follows from Postulate II that, for each fixed $F$, the caloric equation of
state has a unique solution for $\eta$:

$$ \eta = \tilde\eta(F,\varepsilon); \tag{9.1} $$

and that the function $\tilde\eta$ is strictly increasing in $\varepsilon$ for each $F$. This one-to-one
correspondence between $\varepsilon$ and $\eta$ at each $F$ makes it possible to give an alternative

* It has also been pointed out by HILL [15] that an assumption of unrestricted
convexity of $\hat\varepsilon$ in the deformation gradient would lead to unacceptable physical
behavior.




axiomatization of our present theory of thermostatics by taking $\varepsilon$ and $F$ as
independent variables and defining thermal equilibrium in terms of the function
$\tilde\eta$. In such a formulation a local thermomechanic state is characterized by
a pair $(F,\varepsilon)$, and thermal equilibrium is defined as follows:

Alternative definition of thermal equilibrium. The local state $(F,\varepsilon)$ is called
a state of thermal equilibrium under the force temperature pair $(S_r,\theta)$, with $\theta\neq 0$, if

(a) the stress tensor $S=(\rho/\rho_r)FS_r$ is symmetric,

(b) the inequality

$$ \tilde\eta(F^*,\varepsilon^*) + \frac{1}{\theta\rho_r}\operatorname{tr}(F^*S_r) - \frac{\varepsilon^*}{\theta} < \tilde\eta(F,\varepsilon) + \frac{1}{\theta\rho_r}\operatorname{tr}(FS_r) - \frac{\varepsilon}{\theta} \tag{9.2} $$

holds for all states $(F^*,\varepsilon^*)\neq(F,\varepsilon)$ such that $F^*=GF$, where $G$ is symmetric and
positive definite.


Theorem 4. The definition of thermal equilibrium given in § 6 and the alternative
definition of thermal equilibrium are equivalent (for $\theta\neq 0$) if Postulate II is assumed.

Proof. In § 7 we showed that, under the original definition of § 6, in order
for a state $(F,\eta)$ to be a state of thermal equilibrium for the force temperature
pair $(S_r,\theta)$ it is necessary that

$$ \theta = \hat\varepsilon_\eta(F,\eta). \tag{9.3} $$

By a very similar argument it can be shown, using the alternative definition of
thermal equilibrium, that in order for the state $(F,\varepsilon)$ to be a state of thermal
equilibrium it is necessary that

$$ \frac{1}{\theta} = \tilde\eta_\varepsilon(F,\varepsilon). \tag{9.4} $$

Now, by Postulate II, the functions $\hat\varepsilon$ and $\tilde\eta$ are strictly increasing in $\eta$ and $\varepsilon$,
respectively, for fixed $F$. Hence, $\theta$ cannot be negative if $(S_r,\theta)$ is to be a force
temperature pair for some state of thermal equilibrium, regardless of which of
the two definitions is used. Since we here assume $\theta\neq 0$, we have $\theta>0$, and
(9.2) can be multiplied by $\theta$ and then rearranged to give

$$ -\theta\,\tilde\eta(F^*,\varepsilon^*) - \frac{1}{\rho_r}\operatorname{tr}(F^*S_r) + \varepsilon^* > -\theta\,\tilde\eta(F,\varepsilon) - \frac{1}{\rho_r}\operatorname{tr}(FS_r) + \varepsilon. \tag{9.5} $$

Noting the relations

$$ \varepsilon = \hat\varepsilon(F,\eta), \quad \eta = \tilde\eta(F,\varepsilon), \quad \varepsilon^* = \hat\varepsilon(F^*,\eta^*), \quad \eta^* = \tilde\eta(F^*,\varepsilon^*) \tag{9.6} $$

and (6.1), we see that (9.5) is equivalent to (6.2). The requirement that $(F,\eta)\neq(F^*,\eta^*)$
and the requirements on $G=F^*F^{-1}$ are the same for (6.2) and (9.5).
The condition (a) is obviously the same in both definitions; hence the definitions
are equivalent, q.e.d.

From a certain point of view the alternative definition of thermal equilibrium
given in this section is more fundamental than the original definition of § 6.
The alternative definition is more closely related to the physical notion that,
since entropy tends to increase, equilibrium states should be, in some sense,
states of maximum entropy. The definition of § 6 is closely related to the idea,
which is often used in mechanics, that equilibrium states should be, in some
sense, states of minimum potential. It should be emphasized that the two
definitions are equivalent only if Postulate II is assumed; i.e., only if states of
negative temperature are excluded. Of course, negative temperatures never
occur in continuum mechanics, but there are subjects in which they do occur
(cf. [11], [12]). Statistical mechanical considerations suggest that for systems
capable of negative temperatures a practical definition of thermal equilibrium
should be based on the idea of maximum entropy.


10. Infinitesimal deformations from an arbitrary state 


Here we consider the classical theory of infinitesimal deformations from an
arbitrary initial configuration. We make no attempt to justify the use of the
theory of infinitesimal deformations as an approximation to the theory of finite
deformations.

In the theory of infinitesimal deformations one considers cases in which
$F^*=GF$ is obtained from $F$ by superimposing an infinitesimal deformation. The
infinitesimal strain tensor $E$ is defined as the symmetric part of $G-I$.

In the special case in which $G$ is positive definite and symmetric (i.e., when
$F^*$ is related to $F$ by a pure stretch) we have

$$ E = G - I \tag{10.1} $$

and the excess energy $\hat\varepsilon(GF,\eta)-\hat\varepsilon(F,\eta)$ is a function of $E$ alone:

$$ \sigma(E) = \hat\varepsilon(GF,\eta) - \hat\varepsilon(F,\eta). \tag{10.2} $$

Equation (10.2) is valid approximately even when $G$ is not symmetric.

In the infinitesimal theory it is assumed (i) that (10.2) is valid exactly for
all $G$ and (ii) that the excess energy is exactly given by the sum,

$$ \sigma(E) = \sigma_1(E) + \sigma_2(E), \tag{10.3} $$

of a term $\sigma_1(E)$ linear in $E$ and a term $\sigma_2(E)$ quadratic in $E$.

By taking the gradient of (10.2) with respect to $E$ and then putting $E=0$,
it is easily shown that the linear term $\sigma_1(E)$ must be given by

$$ \sigma_1(E) = \operatorname{tr}[EF\,\hat\varepsilon_F(F,\eta)]. \tag{10.4} $$

Hence, using the stress relation (7.1), we have

$$ \sigma_1(E) = \frac{1}{\rho}\operatorname{tr}(ES), \tag{10.5} $$

where $S$ is the stress of the original state $(F,\eta)$.

Now, the fundamental inequality (8.8) may be written

$$ \hat\varepsilon(GF,\eta) - \hat\varepsilon(F,\eta) - \operatorname{tr}[(G-I)F\,\hat\varepsilon_F(F,\eta)] > 0. \tag{10.6} $$

From (10.1), (10.2) we get

$$ \sigma(E) - \operatorname{tr}[EF\,\hat\varepsilon_F(F,\eta)] > 0, \tag{10.7} $$

and it follows from (10.3), (10.4) and (10.5) that

$$ \sigma_2(E) = \sigma(E) - \frac{1}{\rho}\operatorname{tr}(ES) > 0. \tag{10.8} $$




This inequality is the content of the following theorem:

Theorem 5. For an infinitesimal deformation superimposed, at fixed entropy,
on an arbitrary state, the excess energy is the sum of a positive definite quadratic
form in the infinitesimal strain tensor $E$ of the superimposed strain and a linear
term $\frac{1}{\rho}\operatorname{tr}(ES)$, where $\rho$ is the density and $S$ the stress corresponding to the original
state.

If the original state is a natural state in which the stress vanishes, the above
theorem reduces to the familiar statement that the strain energy is a positive
definite quadratic form in the infinitesimal strain tensor. For isotropic materials,
this statement is equivalent to the following well known inequalities for the Lamé
constants:

$$ \mu > 0, \qquad 3\lambda + 2\mu > 0, \tag{10.9} $$

which state that the shear modulus and the compression modulus must be positive.
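
The role of the two inequalities (10.9) can be seen in a short numerical sketch. It assumes the standard isotropic quadratic form $\tfrac{1}{2}\lambda(\operatorname{tr}E)^2+\mu\operatorname{tr}(E^2)$ for the strain energy, taken here up to a positive density factor; this explicit form is not written out in the text above and is supplied as an assumption, as are the function names.

```python
def trace(E):
    """Trace of a 3x3 matrix stored as nested lists."""
    return E[0][0] + E[1][1] + E[2][2]

def trace_of_square(E):
    """tr(E E) for a 3x3 matrix."""
    return sum(E[i][j] * E[j][i] for i in range(3) for j in range(3))

def strain_energy(E, lame_lambda, mu):
    """Assumed isotropic quadratic form (lambda/2)(tr E)^2 + mu tr(E^2)."""
    return 0.5 * lame_lambda * trace(E) ** 2 + mu * trace_of_square(E)

def satisfies_10_9(lame_lambda, mu):
    """The two inequalities (10.9)."""
    return mu > 0 and 3 * lame_lambda + 2 * mu > 0

# Two test strains: a pure dilatation and a simple shear.
DILATATION = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
SHEAR = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
```

A pure dilatation probes the sign of $3\lambda+2\mu$ and a simple shear probes the sign of $\mu$, so a violation of either inequality in (10.9) produces a strain with negative energy.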


11. Simple fluids 
For simple fluids we have, by (4.2) and (4.3),

$$ \hat\varepsilon(F,\eta) = \bar\varepsilon(|\det F|\,v_r,\eta), \tag{11.1} $$

where $v_r=1/\rho_r$ is the specific volume in the reference configuration. Taking the
gradient of (11.1) with respect to $F$, we obtain

$$ \hat\varepsilon_F(F,\eta) = \bar\varepsilon_v(v,\eta)\,v\,F^{-1}, \tag{11.2} $$

where $v=|\det F|\,v_r$. On substituting (11.2) into the fundamental inequality
(8.3) and using (8.4), we obtain

$$ \bar\varepsilon(v^*,\eta^*) - \bar\varepsilon(v,\eta) - v\,\bar\varepsilon_v(v,\eta)\operatorname{tr}(G-I) - (\eta^*-\eta)\,\bar\varepsilon_\eta(v,\eta) > 0, \tag{11.3} $$

which must hold for all positive definite symmetric $G=F^*F^{-1}$ whenever either
$G\neq I$ or $\eta\neq\eta^*$.

We assume now that $v^*=v$; i.e., that $|\det F^*|=|\det F|$, which means that
$G$ is unimodular. We also choose $\eta^*=\eta$. Then (11.3) reduces to

$$ -v\,\bar\varepsilon_v(v,\eta)\operatorname{tr}(G-I) > 0, \tag{11.4} $$

which must be valid for all symmetric positive definite unimodular tensors $G\neq I$.
Let $g_1$, $g_2$, $g_3$ be the proper numbers of $G$. We then have

$$ g_i > 0, \qquad g_1g_2g_3 = 1 \tag{11.5} $$

and

$$ \operatorname{tr}(G-I) = g_1+g_2+g_3-3. \tag{11.6} $$

Using the fact that the arithmetic mean is greater than the geometric mean,

$$ \frac{g_1+g_2+g_3}{3} > \sqrt[3]{g_1g_2g_3}, \tag{11.7} $$

we see that (11.5) and (11.6) imply

$$ \operatorname{tr}(G-I) > 0. \tag{11.8} $$
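
The step from (11.5) and (11.6) to (11.8) can be spot-checked numerically. The sketch below works directly with the proper numbers: two of them are chosen freely and the third is fixed by the unimodularity condition in (11.5). The function name `trace_excess` is a hypothetical helper introduced here.

```python
def trace_excess(g1, g2):
    """tr(G - I) for a unimodular G with proper numbers g1, g2 and 1/(g1*g2)."""
    if g1 <= 0 or g2 <= 0:
        raise ValueError("proper numbers of a positive definite G must be positive")
    g3 = 1.0 / (g1 * g2)  # enforces g1*g2*g3 = 1, as in (11.5)
    return g1 + g2 + g3 - 3.0  # the expression (11.6)
```

The value vanishes only for $g_1=g_2=g_3=1$, i.e. for $G=I$, which is the case excluded in (11.4).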


Arch. Rational Mech. Anal., Vol. 4 8 




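Hence, it follows from (11.4) that

$$ \bar\varepsilon_v(v,\eta) < 0 \tag{11.9} $$

for all $v$ and $\eta$ for which $\bar\varepsilon$ is defined. Thus, $\bar\varepsilon(v,\eta)$ must be a strictly decreasing
function of $v$ for each fixed $\eta$.

Substitution of (11.2) into (7.1) shows that the stress relation reduces to

$$ S = -p(v,\eta)\,I, \tag{11.10} $$

where

$$ p(v,\eta) = -\bar\varepsilon_v(v,\eta) \tag{11.11} $$

is the hydrostatic pressure. By (11.9) it is positive.

For further exploitation of (11.3) we choose $G=\alpha I$, $\alpha>0$. Since

$$ \alpha^3 = |\det G| = \frac{|\det F^*|}{|\det F|} = \frac{v^*}{v}, $$

we have

$$ \alpha = \Big(\frac{v^*}{v}\Big)^{1/3}. \tag{11.12} $$

Substitution of (11.12) into (11.3) yields the inequality

$$ \bar\varepsilon(v^*,\eta^*) - \bar\varepsilon(v,\eta) - 3v\,\bar\varepsilon_v(v,\eta)\Big[\Big(\frac{v^*}{v}\Big)^{1/3}-1\Big] - (\eta^*-\eta)\,\bar\varepsilon_\eta(v,\eta) > 0, \tag{11.13} $$

which must be valid for all $v$, $v^*$, $\eta$, $\eta^*$ except, of course, when both $v=v^*$ and
$\eta=\eta^*$. In order to understand the significance of this inequality we introduce
the new variable

$$ \nu = \sqrt[3]{v} \tag{11.14} $$

and define the function $\tilde\varepsilon$ by

$$ \tilde\varepsilon(\nu,\eta) = \bar\varepsilon(v,\eta) = \bar\varepsilon(\nu^3,\eta). \tag{11.15} $$

A straightforward calculation shows that (11.13) is equivalent to the inequality

$$ \tilde\varepsilon(\nu^*,\eta^*) - \tilde\varepsilon(\nu,\eta) - (\nu^*-\nu)\,\tilde\varepsilon_\nu(\nu,\eta) - (\eta^*-\eta)\,\tilde\varepsilon_\eta(\nu,\eta) > 0, \tag{11.16} $$

which states that $\tilde\varepsilon(\nu,\eta)$ is strictly convex in $\nu$ and $\eta$ jointly. If $\tilde\varepsilon(\nu,\eta)$ happens
to possess continuous second derivatives, it follows that the matrix

$$ \begin{Vmatrix} \tilde\varepsilon_{\nu\nu} & \tilde\varepsilon_{\nu\eta} \\ \tilde\varepsilon_{\eta\nu} & \tilde\varepsilon_{\eta\eta} \end{Vmatrix} \tag{11.17} $$

must be positive semi-definite.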
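
The "straightforward calculation" connecting (11.13) and (11.16) can be checked on an example. The sketch below assumes a hypothetical energy function of the polytropic ideal-gas form $\bar\varepsilon(v,\eta)=A\,v^{-(\gamma-1)}e^{\eta/c_v}$; the constants `A`, `GAMMA`, `C_V` and this functional form are illustrative assumptions, not taken from the paper. The left sides of (11.13) and (11.16) are evaluated separately and should agree.

```python
import math

A, GAMMA, C_V = 1.0, 1.4, 1.0  # assumed constants, for illustration only

def eps_bar(v, eta):
    """Assumed energy as a function of specific volume v and entropy eta."""
    return A * v ** (-(GAMMA - 1.0)) * math.exp(eta / C_V)

def eps_bar_v(v, eta):
    """Partial derivative of eps_bar with respect to v."""
    return -(GAMMA - 1.0) * eps_bar(v, eta) / v

def eps_bar_eta(v, eta):
    """Partial derivative of eps_bar with respect to eta."""
    return eps_bar(v, eta) / C_V

def eps_tilde(nu, eta):
    """Energy as a function of the cube root nu of v, as in (11.15)."""
    return eps_bar(nu ** 3, eta)

def eps_tilde_nu(nu, eta):
    """Chain rule: derivative of eps_tilde with respect to nu."""
    return 3.0 * nu ** 2 * eps_bar_v(nu ** 3, eta)

def lhs_11_13(v_star, eta_star, v, eta):
    """Left side of (11.13)."""
    return (eps_bar(v_star, eta_star) - eps_bar(v, eta)
            - 3.0 * v * eps_bar_v(v, eta) * ((v_star / v) ** (1.0 / 3.0) - 1.0)
            - (eta_star - eta) * eps_bar_eta(v, eta))

def lhs_11_16(nu_star, eta_star, nu, eta):
    """Left side of (11.16)."""
    return (eps_tilde(nu_star, eta_star) - eps_tilde(nu, eta)
            - (nu_star - nu) * eps_tilde_nu(nu, eta)
            - (eta_star - eta) * eps_bar_eta(nu ** 3, eta))
```

For this assumed energy the pressure $-\bar\varepsilon_v$ is positive and $\tilde\varepsilon$ is jointly strictly convex, so both left sides are positive whenever the two states differ.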


We summarize in the following theorem: 


Theorem 6. For a simple fluid in thermal equilibrium, the stress $S$ reduces to
a hydrostatic pressure $S=-p(v,\eta)I$. The pressure $p(v,\eta)=-\bar\varepsilon_v(v,\eta)$ is always
positive. The energy density $\tilde\varepsilon(\nu,\eta)$ is a strictly convex function of the cube root $\nu$
of the specific volume and the entropy $\eta$ jointly.

It is not hard to show that, for simple fluids, the positivity of $p(v,\eta)$ and
the convexity of $\tilde\varepsilon(\nu,\eta)$ are not only necessary but also sufficient conditions for
the validity of the fundamental inequality (11.3). Hence, these conditions are
also sufficient conditions for the validity of Postulate I for simple fluids.




12. Isotropic materials 
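For isotropic materials in general, if we pick an undistorted state as reference,
we have, by (4.5),

$$ \hat\varepsilon(F,\eta) = \hat\varepsilon(VR,\eta) = \hat\varepsilon(V,\eta), \tag{12.1} $$

where $V$ is the left stretch tensor, defined by the polar decomposition $F=VR$.
On computing the gradient of (12.1) with respect to $V$, we find

$$ \hat\varepsilon_F(F,\eta) = R^T\hat\varepsilon_V(V,\eta). \tag{12.2} $$

If we substitute (12.2) into (7.1) and again use $F=VR$, we see that the stress
relation may be written in the form

$$ S = \rho V\hat\varepsilon_V(V,\eta) = \rho\,\hat\varepsilon_V(V,\eta)V. \tag{12.3} $$

On substituting (8.4), (12.1) and (12.2) into the fundamental inequality (8.3)
and observing that $\hat\varepsilon(F^*,\eta^*)=\hat\varepsilon(GVR,\eta^*)=\hat\varepsilon(GV,\eta^*)$,
we obtain

$$ \hat\varepsilon(GV,\eta^*) - \hat\varepsilon(V,\eta) - \operatorname{tr}[(G-I)V\hat\varepsilon_V(V,\eta)] - (\eta^*-\eta)\,\hat\varepsilon_\eta(V,\eta) > 0. \tag{12.4} $$

This inequality must be valid for all $\eta$, $\eta^*$ and all symmetric and positive definite
$G$ and $V$, except, of course, when both $\eta=\eta^*$ and $G=I$.

We consider now the special case when $G$ and $V$ commute; i.e., when

$$ V^* = GV \tag{12.5} $$

is symmetric. In this case the tensors $V$ and $V^*$ have an orthonormal basis of
proper vectors in common. The matrices of $V$ and $V^*$, relative to this basis, are

$$ \|V\| = \begin{Vmatrix} v_1 & 0 & 0 \\ 0 & v_2 & 0 \\ 0 & 0 & v_3 \end{Vmatrix}, \qquad \|V^*\| = \begin{Vmatrix} v_1^* & 0 & 0 \\ 0 & v_2^* & 0 \\ 0 & 0 & v_3^* \end{Vmatrix}, \tag{12.6} $$

where the $v_i$ and the $v_i^*$ are the proper numbers of $V$ and $V^*$, respectively. The
matrix of $\hat\varepsilon_V(V,\eta)$ is

$$ \|\hat\varepsilon_V(V,\eta)\| = \begin{Vmatrix} \bar\varepsilon_1 & 0 & 0 \\ 0 & \bar\varepsilon_2 & 0 \\ 0 & 0 & \bar\varepsilon_3 \end{Vmatrix}, \tag{12.7} $$

where

$$ \bar\varepsilon_i = \bar\varepsilon_i(v_j,\eta) = \frac{\partial}{\partial v_i}\bar\varepsilon(v_1,v_2,v_3,\eta) \tag{12.8} $$

are the partial derivatives of the function (4.6). Substitution of (12.5), (4.6),
(12.6), and (12.7) into (12.4) gives the inequality

$$ \bar\varepsilon(v_i^*,\eta^*) - \bar\varepsilon(v_i,\eta) - \sum_{i=1}^{3}(v_i^*-v_i)\,\bar\varepsilon_i(v_j,\eta) - (\eta^*-\eta)\,\bar\varepsilon_\eta(v_i,\eta) > 0, \tag{12.9} $$

which is valid except when $\eta^*=\eta$ and $v_i^*=v_i$ for all $i$. We have thus proved*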


* We have shown that for isotropic materials the inequality (12.9) is a necessary
condition for validity of the fundamental inequality (8.3). At the present time, it 
is an open matter as to whether (12.9) is sufficient for the validity of (8.3) in the 
isotropic case, or whether further inequalities which are independent of (12.9) can be 
deduced from Postulate I for isotropic materials. 




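Theorem 7. For an isotropic material, the energy density $\bar\varepsilon(v_i,\eta)$ is a strictly
convex function of the principal stretches $v_i$ and the entropy density $\eta$ jointly.

If $\bar\varepsilon$ happens to be twice continuously differentiable, it follows that the matrix

$$ \begin{Vmatrix} \bar\varepsilon_{11} & \bar\varepsilon_{12} & \bar\varepsilon_{13} & \bar\varepsilon_{1\eta} \\ \bar\varepsilon_{21} & \bar\varepsilon_{22} & \bar\varepsilon_{23} & \bar\varepsilon_{2\eta} \\ \bar\varepsilon_{31} & \bar\varepsilon_{32} & \bar\varepsilon_{33} & \bar\varepsilon_{3\eta} \\ \bar\varepsilon_{\eta 1} & \bar\varepsilon_{\eta 2} & \bar\varepsilon_{\eta 3} & \bar\varepsilon_{\eta\eta} \end{Vmatrix} \tag{12.10} $$

is positive semidefinite. Here the indices 1, 2, 3, and $\eta$ denote the derivatives
of $\bar\varepsilon$ with respect to $v_1$, $v_2$, $v_3$, and $\eta$, respectively.

A corollary of the convexity inequality (12.9) is

Theorem 8. For an isotropic material, the functions $\bar\varepsilon_i(v_j,\eta)$, defined by (12.8),
have the property that $v_i>v_k$ implies $\bar\varepsilon_i(v_j,\eta)>\bar\varepsilon_k(v_j,\eta)$.

Proof. Without loss of generality, we take $i=1$ and $k=2$. We then choose
$v_1^*=v_2$, $v_2^*=v_1$, $v_3^*=v_3$, and $\eta^*=\eta$. Since $\bar\varepsilon(v_i,\eta)$ is a symmetric function of
the principal stretches $v_i$ (cf. § 4), and since the $v_i^*$ differ from the $v_i$ only by
their order, we have

$$ \bar\varepsilon(v_i^*,\eta) = \bar\varepsilon(v_i,\eta). $$

Hence (12.9) reduces to

$$ -(v_2-v_1)\,\bar\varepsilon_1(v_i,\eta) - (v_1-v_2)\,\bar\varepsilon_2(v_i,\eta) > 0; $$

i.e.,

$$ (v_1-v_2)\,[\bar\varepsilon_1(v_i,\eta) - \bar\varepsilon_2(v_i,\eta)] > 0. $$

Thus, if $v_1>v_2$, then $\bar\varepsilon_1(v_i,\eta)>\bar\varepsilon_2(v_i,\eta)$, q.e.d.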
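
The swapping argument in the proof of Theorem 8 can be traced on a concrete example. The sketch below assumes a hypothetical energy that is symmetric and strictly convex in the principal stretches, $\bar\varepsilon=\tfrac{1}{2}\mu\sum_i(v_i-1)^2+e^{\eta}$; this form and the constant `MU` are illustrative assumptions only. The left side of (12.9) is evaluated for the state with $v_1$ and $v_2$ interchanged.

```python
import math

MU = 1.0  # assumed stiffness constant, for illustration only

def eps_bar(v, eta):
    """Assumed symmetric, strictly convex energy of the stretches v = (v1, v2, v3)."""
    return 0.5 * MU * sum((vi - 1.0) ** 2 for vi in v) + math.exp(eta)

def eps_i(v, eta, i):
    """Partial derivative of eps_bar with respect to v[i], cf. (12.8)."""
    return MU * (v[i] - 1.0)

def eps_eta(v, eta):
    """Partial derivative of eps_bar with respect to eta."""
    return math.exp(eta)

def lhs_12_9(v_star, eta_star, v, eta):
    """Left side of the convexity inequality (12.9)."""
    linear = sum((v_star[i] - v[i]) * eps_i(v, eta, i) for i in range(3))
    return (eps_bar(v_star, eta_star) - eps_bar(v, eta)
            - linear - (eta_star - eta) * eps_eta(v, eta))

def swapped_lhs(v, eta):
    """Left side of (12.9) for v* obtained from v by interchanging v1 and v2."""
    v_star = (v[1], v[0], v[2])
    return lhs_12_9(v_star, eta, v, eta)
```

For this assumed energy the swapped left side equals $\mu(v_1-v_2)^2$, in agreement with the reduced form $(v_1-v_2)[\bar\varepsilon_1-\bar\varepsilon_2]$ obtained in the proof.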

In an isotropic material, the left stretch tensor $V$ and the stress tensor $S$
have an orthonormal basis $e_i$ of proper vectors in common. The $e_i$ determine
the principal axes of stress. It follows from (12.3), (12.6), and (12.7) that the
principal stresses are given by

$$ S_i = \rho\,v_i\,\bar\varepsilon_i(v_j,\eta). \tag{12.11} $$

When measured per unit area in the undistorted reference state, these principal
stresses must be replaced by

$$ \bar S_i = \rho_r\,\bar\varepsilon_i(v_j,\eta). \tag{12.12} $$

Hence Theorem 8 has the following simple physical interpretation:

Theorem 8a. If, at a given value $\eta$, the principal stretch $v_i$ is greater than the
principal stretch $v_k$, then the principal stress, measured per unit area in the
undistorted reference state, in the direction of $v_i$ is greater than that in the direction of $v_k$.

It should be noted that the statement of this theorem does not necessarily
remain valid if the principal stresses are measured per unit area of the deformed
state*, except when these stresses are all positive; i.e., except in a state of pure
tension.


* Such a statement was proposed as a postulate by M. BAKER & J. L. ERICKSEN
[13]. In our theory, only the modification given by Theorem 8a is valid. Related
inequalities have been studied by J. BARTA [14].




13. The free energy 


It is often useful to employ the deformation gradient $F$ and the temperature
$\theta$, rather than $F$ and the entropy $\eta$, as the independent variables. This is possible
because, by (8.2) and (8.6), there is a one-to-one correspondence between $\eta$ and $\theta$
for each fixed $F$.

The free energy function $\hat\psi$ is defined by

$$ \hat\psi(F,\theta) = \hat\varepsilon[F,\hat\eta(F,\theta)] - \theta\,\hat\eta(F,\theta), \tag{13.1} $$

where the entropy function $\hat\eta$ is defined in (8.6) as the unique solution of the
equation (8.2). The values $\psi$ of the free energy function $\hat\psi$ are called free energy
densities*.

Differentiation of (13.1) with respect to $F$, using the chain rule, gives

$$ \hat\psi_F(F,\theta) = \hat\varepsilon_F[F,\hat\eta(F,\theta)] + \hat\varepsilon_\eta[F,\hat\eta(F,\theta)]\,\hat\eta_F(F,\theta) - \theta\,\hat\eta_F(F,\theta). $$

It follows from (8.7) that the last two terms cancel, so that

$$ \hat\psi_F(F,\theta) = \hat\varepsilon_F[F,\hat\eta(F,\theta)]. \tag{13.2} $$

Differentiation of (13.1) with respect to $\theta$ gives

$$ \hat\psi_\theta(F,\theta) = -\hat\eta(F,\theta). \tag{13.3} $$


From Theorems 1 and 2, (13.2), and (13.3) we get

Theorem 9. For the force temperature pair $(S_r,\theta)$ to make the local state $(F,\eta)$
a state of thermal equilibrium, it is necessary and sufficient that $S_r$ and $\eta$ be given by

$$ S_r = \rho_r\,\hat\psi_F(F,\theta), \tag{13.4} $$
$$ \eta = -\hat\psi_\theta(F,\theta). \tag{13.5} $$

On multiplying (13.4) on the left by $(\rho/\rho_r)F$ and noting that $S=(\rho/\rho_r)FS_r$,
we get the following form for the stress relation:

$$ S = \rho F\,\hat\psi_F(F,\theta). \tag{13.6} $$

Assuming that two temperatures $\theta$, $\theta^*$ and two deformation gradients $F$, $F^*$
are given, we now put

$$ \eta = \hat\eta(F,\theta) = -\hat\psi_\theta(F,\theta), \qquad \eta^* = \hat\eta(F^*,\theta^*) = -\hat\psi_\theta(F^*,\theta^*). \tag{13.7} $$

By substituting (13.1) to (13.7) into the fundamental inequality (8.3) of Theorem 2,
we obtain

Theorem 10. The free energy function $\hat\psi$ obeys the inequality

$$ \hat\psi(F^*,\theta^*) - \hat\psi(F,\theta) - \operatorname{tr}[(F^*-F)\,\hat\psi_F(F,\theta)] - (\theta^*-\theta)\,\hat\psi_\theta(F^*,\theta^*) > 0 \tag{13.8} $$

for any two pairs $(F,\theta)$ and $(F^*,\theta^*)\neq(F,\theta)$ in the domain of definition of $\hat\psi$ which
are related by

$$ F^* = GF, \tag{13.9} $$

where $G$ is symmetric and positive definite.


* The term "Helmholtz free energy per unit mass" would also be in accord with
common usage.




As Theorem 2, so also Theorem 10 is equivalent to Postulate I.

If in (13.8) we take the special case $F^*=F$ and interchange $\theta$ and $\theta^*$, we
obtain

$$ \hat\psi(F,\theta^*) - \hat\psi(F,\theta) - (\theta^*-\theta)\,\hat\psi_\theta(F,\theta) < 0. \tag{13.10} $$

This inequality, which is valid for all $F$ and all $\theta^*\neq\theta$, states that the free energy
function $\hat\psi(F,\theta)$ is strictly concave in $\theta$ for each $F$.
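
The relations (13.1), (13.3) and (13.10) can be traced on an example at fixed $F$. The sketch below assumes a hypothetical energy $\hat\varepsilon(\eta)=c\,e^{\eta/c}$ with an assumed constant `C`; the dependence on $F$ is suppressed, and none of these choices comes from the paper. For this energy (8.2) gives $\theta=e^{\eta/c}$, so that (8.6) reads $\hat\eta(\theta)=c\ln\theta$.

```python
import math

C = 2.0  # assumed constant, for illustration only

def eps_hat(eta):
    """Assumed energy at fixed F, strictly increasing and strictly convex in eta."""
    return C * math.exp(eta / C)

def eta_hat(theta):
    """Solution of theta = d(eps_hat)/d(eta) for eta, cf. (8.6)."""
    return C * math.log(theta)

def psi_hat(theta):
    """Free energy from (13.1) at fixed F."""
    eta = eta_hat(theta)
    return eps_hat(eta) - theta * eta

def psi_theta(theta, h=1e-6):
    """Central-difference approximation of the derivative of psi_hat."""
    return (psi_hat(theta + h) - psi_hat(theta - h)) / (2.0 * h)

def lhs_13_10(theta_star, theta):
    """Left side of (13.10), using (13.3) for the derivative of psi_hat."""
    return psi_hat(theta_star) - psi_hat(theta) - (theta_star - theta) * (-eta_hat(theta))
```

The finite-difference derivative should reproduce $-\hat\eta(\theta)$, as in (13.3), and the left side of (13.10) should be negative for $\theta^*\neq\theta$.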
Putting $\theta^*=\theta$ in (13.8) gives the following restricted convexity of $\hat\psi$ in $F$:

$$ \hat\psi(F^*,\theta) - \hat\psi(F,\theta) - \operatorname{tr}[(F^*-F)\,\hat\psi_F(F,\theta)] > 0, \tag{13.11} $$

the restriction being the condition (13.9).

The considerations and results of § 11 and § 12 on simple fluids and isotropic
materials remain valid if the energy function $\hat\varepsilon$ is replaced by the free energy
function $\hat\psi$, except that the convexity of $\hat\varepsilon(F,\eta)$ in $\eta$ corresponds to the concavity
of $\hat\psi(F,\theta)$ in $\theta$. We summarize the relevant results.

For a simple fluid, the free energy density reduces to a function of the specific
volume $v$ and the temperature $\theta$ only:

$$ \psi = \hat\psi(F,\theta) = \bar\psi(v,\theta). \tag{13.12} $$

The stress reduces to a hydrostatic pressure given by

$$ S = -p(v,\theta)\,I, \qquad p(v,\theta) = -\bar\psi_v(v,\theta). \tag{13.13} $$

The pressure is always positive. The function $\tilde\psi$, giving the free energy as a
function of the cube root $\nu$ of the specific volume and the temperature,

$$ \tilde\psi(\nu,\theta) = \bar\psi(\nu^3,\theta), \tag{13.14} $$

satisfies the inequality

$$ \tilde\psi(\nu^*,\theta^*) - \tilde\psi(\nu,\theta) - (\nu^*-\nu)\,\tilde\psi_\nu(\nu,\theta) - (\theta^*-\theta)\,\tilde\psi_\theta(\nu^*,\theta^*) > 0. \tag{13.15} $$

This inequality implies that $\tilde\psi(\nu,\theta)$ is strictly convex in $\nu$ for each $\theta$ and strictly
concave in $\theta$ for each $\nu$.

For isotropic materials in general, the free energy reduces to a function of
the temperature $\theta$ and the three principal stretches $v_1$, $v_2$, $v_3$, computed relative
to an undistorted state:

$$ \hat\psi(F,\theta) = \bar\psi(v_1,v_2,v_3;\theta). \tag{13.16} $$

The function $\bar\psi$ is symmetric and strictly convex in the variables $v_1$, $v_2$, $v_3$; $\bar\psi$ is
strictly concave in $\theta$. Theorems 8 and 8a remain valid if $\bar\varepsilon$ is replaced by $\bar\psi$;
i.e., if the temperature, rather than the entropy, is fixed at a given value. The
stress relation may be written in the form

$$ S = \rho V\hat\psi_V(V,\theta), \tag{13.17} $$

where $V$ is the left stretch tensor.

The forms (13.6), (13.13), and (13.17) of the stress relation are useful in
discussing experiments involving equilibrium states for which the temperature
is controlled, while the forms (7.1), (11.10), and (12.3) are appropriate for
discussing experiments involving equilibrium states for which the entropy is
controlled.




14. Thermal stability

Consider a body $\mathcal{B}$ and a global thermomechanic state $\{f,\eta\}$ of $\mathcal{B}$, defined by
a configuration $f$ of $\mathcal{B}$ and an entropy distribution $\eta$ of $\mathcal{B}$ (cf. § 2). Let the
caloric equation of state of the material point $X$ of $\mathcal{B}$ be given by

$$ \varepsilon(X) = \hat\varepsilon[F(X),\eta(X);X]. \tag{14.1} $$

Here $F(X)$ is the deformation gradient at $X$ of the configuration $f$ relative to
some reference configuration $f_r$. We do not assume that the body is homogeneous,
and hence the function $\hat\varepsilon$ may depend explicitly on $X$ as indicated in (14.1).
The total entropy of $\mathcal{B}$ in the given state is defined by

$$ H = \int_{\mathcal{B}} \eta(X)\,dm \tag{14.2} $$

and the total internal energy of $\mathcal{B}$ by

$$ E = \int_{\mathcal{B}} \hat\varepsilon[F(X),\eta(X);X]\,dm. \tag{14.3} $$

In this section we shall deal with situations in which the deformation gradient
$F(X)$ is kept fixed at each $X$ while the entropy field $\eta=\eta(X)$ is varied. It will
not be necessary to make the dependence of $\hat\varepsilon$ on $F$ explicit, and the following
abbreviated notation will be convenient:

$$ \hat\varepsilon[F(X),\eta(X);X] = \varepsilon(X,\eta(X)). \tag{14.4} $$


Definition of thermal stability. Let $\{f,\eta\}$ be a state of $\mathcal{B}$ and let $E$ and $H$
be, respectively, the total internal energy and total entropy corresponding to the state
$\{f,\eta\}$. We say that $\{f,\eta\}$ is a thermally stable state of $\mathcal{B}$ if every other state $\{f,\eta^*\}$,
with the same configuration as $\{f,\eta\}$ and the same total entropy as $\{f,\eta\}$,

$$ H^* = \int_{\mathcal{B}} \eta^*(X)\,dm = H = \int_{\mathcal{B}} \eta(X)\,dm, \tag{14.5} $$

has a greater total internal energy than the state $\{f,\eta\}$; i.e.,

$$ E^* = \int_{\mathcal{B}} \varepsilon(X,\eta^*(X))\,dm > E = \int_{\mathcal{B}} \varepsilon(X,\eta(X))\,dm. \tag{14.6} $$

We give another condition, equivalent to the one given above, which could
also be used to define thermal stability.

Theorem 11. A state $\{f,\eta\}$ of $\mathcal{B}$ is thermally stable if and only if every other
state $\{f,\eta^*\}$ with the same configuration as $\{f,\eta\}$ and the same total energy as $\{f,\eta\}$,

$$ E^* = \int_{\mathcal{B}} \varepsilon(X,\eta^*(X))\,dm = E = \int_{\mathcal{B}} \varepsilon(X,\eta(X))\,dm, \tag{14.7} $$

has a lower total entropy than the state $\{f,\eta\}$; i.e.,

$$ H = \int_{\mathcal{B}} \eta(X)\,dm > H^* = \int_{\mathcal{B}} \eta^*(X)\,dm. \tag{14.8} $$


Proof. We show that the hypothesis of Theorem 11 is necessary for thermal
stability by showing that if there exists a state $\{f,\eta_1\}$ (with $\eta_1$ not identical to $\eta$)
which obeys the equation (14.7) of Theorem 11 but violates (14.8), then there
must exist a state $\{f,\eta_2\}$ (with $\eta_2$ not identical to $\eta$) which obeys the equation
(14.5) of the definition of thermal stability but which does not obey (14.6). Let
$\eta_1$ be the entropy density distribution which obeys (14.7) but not (14.8); we
construct $\eta_2$ as follows:

$$ \eta_2(X) = \eta_1(X) + \frac{H-H_1}{m(\mathcal{B})}, \tag{14.9} $$

where $H_1$ is the total entropy corresponding to $\eta_1$ and $m(\mathcal{B})$ is the mass of $\mathcal{B}$. The total
entropy corresponding to $\eta_2$ is

$$ H_2 = \int_{\mathcal{B}} \eta_2(X)\,dm = H. \tag{14.10} $$

Hence, the state $\{f,\eta_2\}$ obeys the equation (14.5) of the definition. We have
assumed that $\eta_1$ is not identical to $\eta$ and that $H_1\geq H$. If $H_1=H$, then $\eta_2$ is the
same as $\eta_1$, and hence different from $\eta$. In this trivial case of $\eta_2=\eta_1$, it follows
from the fact that $\eta_1$ obeys (14.7) that

$$ \int_{\mathcal{B}} \varepsilon(X,\eta_2(X))\,dm = \int_{\mathcal{B}} \varepsilon(X,\eta_1(X))\,dm = \int_{\mathcal{B}} \varepsilon(X,\eta(X))\,dm. \tag{14.11} $$

If $H_1>H$, then $\eta_2(X)<\eta_1(X)$ for all $X$ in $\mathcal{B}$. It then follows from Postulate II
of § 8 and the assumption that $\eta_1$ obeys (14.7) that

$$ \int_{\mathcal{B}} \varepsilon(X,\eta_2(X))\,dm < \int_{\mathcal{B}} \varepsilon(X,\eta_1(X))\,dm = \int_{\mathcal{B}} \varepsilon(X,\eta(X))\,dm. \tag{14.12} $$

It is clear from (14.12) that $\eta_2$ is not identical to $\eta$. Hence, whenever $H_1\geq H$,
we have, by the construction (14.9), a state $\{f,\eta_2\}$ with $\eta_2$ different from $\eta$ but
with $H_2=H$ and

$$ E_2 = \int_{\mathcal{B}} \varepsilon(X,\eta_2(X))\,dm \leq E = \int_{\mathcal{B}} \varepsilon(X,\eta(X))\,dm. \tag{14.13} $$

Thus, a violation of the hypothesis of Theorem 11 implies the existence of a
state different from $\{f,\eta\}$ which obeys (14.5) yet violates (14.6).

The sufficiency of the hypothesis of Theorem 11 is proved analogously by
starting with a state which obeys (14.5) of the definition, but not (14.6), and
then using Postulate II to construct a state which obeys (14.7) of the theorem,
but which violates (14.8).
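
The construction (14.9) used in the proof can be written out for a body consisting of finitely many material points. The sketch below is a discrete illustration only: the integrals over the body are replaced by mass-weighted sums, and the energy density $a\,e^{\eta}$ with a point-dependent coefficient `a` is a hypothetical choice that satisfies Postulate II.

```python
import math

def total_entropy(etas, masses):
    """Discrete analogue of (14.2): mass-weighted sum of the entropy densities."""
    return sum(e * m for e, m in zip(etas, masses))

def total_energy(a, etas, masses):
    """Discrete analogue of (14.3) for the assumed energy density a * exp(eta)."""
    return sum(ai * math.exp(e) * m for ai, e, m in zip(a, etas, masses))

def shift_to_entropy(eta_1, masses, H):
    """Construction (14.9): shift eta_1 uniformly so that the total entropy is H."""
    H_1 = total_entropy(eta_1, masses)
    total_mass = sum(masses)
    return [e + (H - H_1) / total_mass for e in eta_1]
```

When $H_1>H$ the shift lowers the entropy at every point, so by Postulate II the total energy decreases, which is the step leading to (14.12).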


The main result of the present section is the following theorem: 


Theorem 12. A state $\{f,\eta\}$ of a body is thermally stable if and only if it is
of uniform temperature, i.e., if and only if

$$ \theta = \varepsilon_\eta(X,\eta(X)) \tag{14.14} $$

is a constant, independent of the material point $X$.

Proof. To show the necessity of $\theta=\text{constant}$, we observe that, by (14.6),
the function $\eta(X)$ is the solution of the variational problem

$$ \int_{\mathcal{B}} \varepsilon(X,\eta^*(X))\,dm = \text{Minimum} \tag{14.15} $$

subject to the constraint (14.5). It follows that the first variation of

$$ \int_{\mathcal{B}} [\varepsilon(X,\eta^*(X)) - \alpha\,\eta^*(X)]\,dm \tag{14.16} $$

must vanish for $\eta^*=\eta$. Here $\alpha$ is a constant Lagrange parameter. We obtain

$$ \alpha = \varepsilon_\eta(X,\eta(X)) = \theta = \text{constant}. \tag{14.17} $$




To prove the sufficiency of $\theta=\text{constant}$, we substitute the function values
$F(X)$, $\eta(X)$ and $\eta^*(X)$ for $F$, $\eta$ and $\eta^*$ in the convexity inequality (8.5). Using
the abbreviation (14.4) and the equation (14.14), we get

$$ \varepsilon(X,\eta^*(X)) - \varepsilon(X,\eta(X)) - [\eta^*(X)-\eta(X)]\,\theta \geq 0. \tag{14.18} $$

This inequality must be strict for some $X$ if $\eta^*$ and $\eta$ are different continuous
functions. If $\theta$ is a constant and if (14.5) holds, then integration of (14.18) over
the body $\mathcal{B}$ gives the inequality (14.6), which proves that $\{f,\eta\}$ is thermally stable,
q.e.d.
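
Theorem 12 can be illustrated on a body made of two material points. The sketch below is a discrete illustration under assumptions: the energy density $a\,e^{\eta}$ (with a point-dependent coefficient `a`) is a hypothetical choice, so that the temperature (14.14) at each point is $a\,e^{\eta}$, and the helper `transfer` moves entropy between the two points while preserving the constraint (14.5).

```python
import math

def total_entropy(etas, masses):
    """Discrete analogue of (14.2)."""
    return sum(e * m for e, m in zip(etas, masses))

def total_energy(a, etas, masses):
    """Discrete analogue of (14.3) for the assumed energy density a * exp(eta)."""
    return sum(ai * math.exp(e) * m for ai, e, m in zip(a, etas, masses))

def temperatures(a, etas):
    """Temperature (14.14) at each point: derivative of a * exp(eta) in eta."""
    return [ai * math.exp(e) for ai, e in zip(a, etas)]

def uniform_state(a, theta):
    """Entropy densities for which every point has the temperature theta."""
    return [math.log(theta / ai) for ai in a]

def transfer(etas, masses, delta):
    """Move the total entropy delta from point 1 to point 0, preserving (14.5)."""
    return [etas[0] + delta / masses[0], etas[1] - delta / masses[1]]
```

Starting from a state of uniform temperature, every such transfer raises the total energy. Starting from a state of non-uniform temperature, a small transfer of entropy from the hotter point to the colder one lowers the total energy, so that state is not thermally stable.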


15. Mechanical stability 
Consider a state $\{f,\eta\}$ of a body $\mathcal{B}$. According to Postulate I of § 8 it is
possible to find a temperature field $\theta$ and a stress field $S$ such that every material
point of $\mathcal{B}$ is in thermal equilibrium for the force temperature field defined by
$S$ and $\theta$. In fact, $S$ and $\theta$ are given by the stress relation (7.1) and the temperature
relation (7.2), respectively. If a field of body forces $b$ is given, then the state
$\{f,\eta\}$ will be a state of mechanical equilibrium if Cauchy's condition

$$ \operatorname{Div} S + \rho\,b = 0 \tag{15.1} $$

holds. If $\{f,\eta\}$ is such that every material point is in thermal equilibrium, it
is always possible to choose $b$ such that the state $\{f,\eta\}$ is a state of mechanical
equilibrium. We need only to define $b$ by (15.1). We say that the fields $S$, $\theta$,
and $b$, given by (7.1), (7.2) and (15.1), make $\{f,\eta\}$ a state of equilibrium. We
call $S$, $\theta$ and $b$, respectively, the stress, temperature, and body force fields of
the equilibrium state $\{f,\eta\}$.

We investigate the possible meaning that can be given to the statement that
an equilibrium state $\{f,\eta\}$ is stable. First, we require that it be thermally stable
which, according to Theorem 12, means that the temperature $\theta$ must be uniform.
In addition, we require that some condition of mechanical stability be satisfied.
One must distinguish between various types of isothermal mechanical stability and
adiabatic mechanical stability.

In the case of isothermal mechanical stability, one compares the given
equilibrium state $\{f,\eta\}$ with a class of states $\{f^*,\eta^*\}$ corresponding to the same
uniform temperature $\theta=\theta(F,\eta)$ as the given state. Each of these states is
characterized by its configuration $f^*$ alone, because the corresponding entropy
distribution is then determined by

$$ \eta^* = \hat\eta(F^*,\theta). \tag{15.2} $$

External forces or boundary conditions must be prescribed for each of the
comparison configurations $f^*$. The configuration $f$ is called stable if the increase
in the total free energy would always be greater than the work done on the
body by the external forces if the configuration were to be deformed into any
of the comparison configurations $f^*$. We give more precise definitions in two
special cases.


Definition of isothermal stability at fixed boundary (IFB stability). An equilibrium state $\{f, \eta\}$ is called IFB stable if $\{f, \eta\}$ has a uniform temperature $\theta$ and if for every state $\{f^*, \eta^*\}$ which satisfies the following conditions:

(a) $f^*$ lies in a prescribed neighborhood* of $f$,
(b) $f^*(X) = f(X)$, when $X$ belongs to $\partial\mathcal{B}$, (15.3)
(c) the temperature corresponding to $\{f^*, \eta^*\}$ is equal to $\theta$ for all $X$ in $\mathcal{B}$,

the following inequality holds:

$$\int_{\mathcal{B}} \left\{\psi(F^*) - \psi(F) - b\cdot(f^* - f)\right\} dm \ge 0. \tag{15.4}$$

Here $\partial\mathcal{B}$ is the boundary of $\mathcal{B}$, and $\psi(F)$ is an abbreviation for

$$\psi(F) = \hat{\psi}(F(X), \theta; X); \tag{15.5}$$

$F^*(X)$ and $F(X)$ are the deformation gradients at $X$ for the configurations $f^*$ and $f$, respectively, both computed relative to the same fixed reference configuration. As in §14, we do not assume that the body is homogeneous, and hence the function $\hat{\psi}$ may depend explicitly on $X$.

We say that $\{f, \eta\}$ is strictly IFB stable if the inequality (15.4) is strict whenever $\{f^*, \eta^*\}$ obeys (a), (b) and (c) and is such that $f^* \ne f$.

Note that the surface tractions do no work if the boundary is fixed and that $-\int_{\mathcal{B}} f^*\cdot b\, dm$ is a potential of the work done by the body forces if these are held at their values $b(X)$ in the equilibrium state $\{f, \eta\}$.


The type of stability considered is affected by the prescription of the neigh- 
borhood in the requirement (a) of the definition of IFB stability. A global state 
may be stable with respect to some (small) neighborhood without being stable 
with respect to other (larger) neighborhoods. 


Definition of isothermal stability at fixed surface tractions (IFT stability). An equilibrium state $\{f, \eta\}$ is called IFT stable if $\{f, \eta\}$ has a uniform temperature $\theta$ and if for every state $\{f^*, \eta^*\}$ which satisfies the following conditions:

(a) $f^*$ lies in a prescribed neighborhood of $f$,
(b) the temperature corresponding to $\{f^*, \eta^*\}$ is equal to $\theta$ for all $X$ in $\mathcal{B}$,

the following inequality holds:

$$\int_{\mathcal{B}} \left\{\psi(F^*) - \psi(F) - b\cdot(f^* - f)\right\} dm - \int_{\partial\mathcal{B}} (f^* - f)\cdot S\,n\, dA \ge 0. \tag{15.6}$$

Here $\partial\mathcal{B}$ is the boundary surface of the region occupied by $\mathcal{B}$ in the configuration $f$; $dA$ is the element of that surface; and $n$ is the exterior unit normal.


Note that $-\int_{\partial\mathcal{B}} f^*\cdot S\,n\, dA$ is a potential of the work done by the surface tractions if they are held at their values in the equilibrium state $\{f, \eta\}$.

An IFT stable state is always also IFB stable. This follows from the fact 
that the surface integral in (15.6) gives no contribution if the boundary condition 
(15.3) holds, so that the inequalities (15.4) and (15.6) become the same in this case. 


* A neighborhood of a configuration is defined by the metric

$$\delta(f, f^*) = \sup_{X \in \mathcal{B}} \left\{ |f^*(X) - f(X)| + |F^{*-1}(X)\, F(X) - I| \right\}$$

over the space of all configurations.




If the inequality (15.6) holds for all states which obey items (a) and (b) of the definition of IFT stability and is, furthermore, a strict inequality for all such states for which $F^*(X) \ne F(X)$ for at least one material point $X$, then we say that $\{f, \eta\}$ is strictly IFT stable against deformations and rotations. For in that case (15.6) can reduce to an equality only if $f^*$ is related to $f$ by a simple rigid translation.


To investigate adiabatic mechanical stability, one compares the given equilibrium state $\{f, \eta\}$ with a class of states which correspond to the same total entropy as $\{f, \eta\}$. We again consider two special cases.


Definition of adiabatic stability at fixed boundary (AFB stability). An equilibrium state $\{f, \eta\}$ is called AFB stable if $\{f, \eta\}$ is thermally stable and if for every state $\{f^*, \eta^*\}$ which satisfies the following conditions:

(a) $f^*$ lies in a prescribed neighborhood of $f$,
(b) $f^*(X) = f(X)$, when $X$ belongs to $\partial\mathcal{B}$,
(c) $\int_{\mathcal{B}} \eta^*(X)\, dm = \int_{\mathcal{B}} \eta(X)\, dm$,

the following inequality holds:

$$\int_{\mathcal{B}} \left\{\hat{\varepsilon}[F^*(X), \eta^*(X); X] - \hat{\varepsilon}[F(X), \eta(X); X] - b\cdot(f^* - f)\right\} dm \ge 0. \tag{15.7}$$

If the inequality in (15.7) is strict for all $\{f^*, \eta^*\}$ satisfying (a), (b) and (c) and for which $f^* \ne f$, then we say that $\{f, \eta\}$ is strictly AFB stable.


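Theorem 13. A thermally stable equilibrium state $\{f, \eta\}$ is AFB stable if and only if for every state $\{f^*, \eta^*\}$ which satisfies the following conditions:

(a) $f^*$ lies in a prescribed neighborhood of $f$,
(b) $f^*(X) = f(X)$ when $X$ belongs to $\partial\mathcal{B}$,
(c) $\int_{\mathcal{B}} \left\{\hat{\varepsilon}(F^*(X), \eta^*(X); X) - b\cdot f^*\right\} dm = \int_{\mathcal{B}} \left\{\hat{\varepsilon}(F(X), \eta(X); X) - b\cdot f\right\} dm$, (15.8)

the following inequality holds:

$$\int_{\mathcal{B}} \eta^*(X)\, dm \le \int_{\mathcal{B}} \eta(X)\, dm. \tag{15.9}$$

Furthermore, $\{f, \eta\}$ is strictly AFB stable if and only if (15.9) is a strict inequality for every state $\{f^*, \eta^*\} \ne \{f, \eta\}$ obeying (a), (b) and (c).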

We omit the proof of Theorem 13 because it is analogous to that of 
Theorem 11. Of course, the validity of Theorem 13 requires the assumption of 
Postulate II. 

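Definition of adiabatic stability at fixed surface tractions (AFT stability). An equilibrium state $\{f, \eta\}$ is called AFT stable if it is thermally stable and if for every state $\{f^*, \eta^*\}$ which satisfies the following conditions:

(a) $f^*$ is in a prescribed neighborhood of $f$,
(b) $\int_{\mathcal{B}} \eta^*(X)\, dm = \int_{\mathcal{B}} \eta(X)\, dm$,

the following inequality holds:

$$\int_{\mathcal{B}} \left\{\hat{\varepsilon}[F^*(X), \eta^*(X); X] - \hat{\varepsilon}[F(X), \eta(X); X] - b\cdot(f^* - f)\right\} dm - \int_{\partial\mathcal{B}} (f^* - f)\cdot S\,n\, dA \ge 0. \tag{15.10}$$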
B 




It will be noticed that a state which is AFT stable is always AFB stable. 


If the inequality (15.10) holds for all states which obey (a) and (b) and is a strict inequality for all such states for which $F^*(X) \ne F(X)$ for at least one $X$, then we say that $\{f, \eta\}$ is strictly AFT stable against deformations and rotations.

It is clear that, in analogy to Theorem 13, an alternative, but equivalent, 
definition of AFT stability can be formulated in which a stable state is defined 
to be one of maximum entropy among all those states for which (15.10) reduces 
to an equality. 

The definitions of IFB, IFT, AFB and AFT stability given above are applicable only to those physical situations in which the body force field $b = b(X)$ is independent of the comparison configuration $f^*$. If one is interested in studying cases in which the body force on $X$ depends on $X$ and is also a functional of $f^*$, one can modify the definitions of stability by connecting the comparison state $f^*$ to $f$ by means of a continuous one-parameter family $f_s$, $0 \le s \le 1$, $f_0 = f$, $f_1 = f^*$, and replacing the term

$$-\int_{\mathcal{B}} b\cdot(f^* - f)\, dm$$

in (15.4), (15.6), (15.7), (15.8) and (15.10) by

$$-\int_{\mathcal{B}} \int_0^1 b(X, f_s)\cdot\frac{\partial f_s(X)}{\partial s}\, ds\, dm.$$

If the body force on each material point is derivable from a single-valued potential, then the integral exhibited above is independent of the parametrization, and is simply the difference in the potentials at $f$ and $f^*$.

In the definitions of IFT and AFT stability, we assumed that not only the 
body forces but also the contact forces at the surface do not depend on the 
comparison configuration. One can also study, in a way analogous to that outlined 
above for the body forces, those cases in which the surface tractions depend on 
the comparison configuration. 


Theorem 14. A state which has isothermal stability of a certain type also has 
adiabatic stability of the corresponding type. 

Proof. Consider a state $\{f, \eta\}$ which has a uniform temperature $\theta$ and which has isothermal stability of a particular type. Let $f^*$ be a configuration which satisfies the boundary conditions, if any, for the appropriate comparison configurations. Define the entropy field $\eta_1$ by

$$\eta_1(X) = \hat{\eta}(F^*(X), \theta), \tag{15.11}$$

where $F^*$ is the deformation gradient field corresponding to the configuration $f^*$. By (13.1) we have

$$\hat{\psi}(F^*, \theta) - \hat{\psi}(F, \theta) = \hat{\varepsilon}(F^*, \eta_1) - \hat{\varepsilon}(F, \eta) - (\eta_1 - \eta)\,\theta. \tag{15.12}$$

Here $F$ corresponds to $f$. Let $\eta^*$ be any entropy distribution satisfying the condition

$$\int_{\mathcal{B}} \eta^*(X)\, dm = \int_{\mathcal{B}} \eta(X)\, dm, \tag{15.13}$$

which is required for comparison states in adiabatic stability. Define the field

$$\beta = \hat{\varepsilon}(F^*, \eta^*) - \hat{\varepsilon}(F^*, \eta_1) - (\eta^* - \eta_1)\,\theta. \tag{15.14}$$

From (8.5) we get $\beta(X) \ge 0$ for all $X$. From (15.12) we have

$$\hat{\psi}(F^*, \theta) - \hat{\psi}(F, \theta) = \hat{\varepsilon}(F^*, \eta^*) - \hat{\varepsilon}(F, \eta) - \beta - (\eta^* - \eta)\,\theta. \tag{15.15}$$

We integrate (15.15) over $\mathcal{B}$. According to (15.13) we get no contribution from the term $-(\eta^* - \eta)\,\theta$; hence, since $\beta$ is non-negative,

$$\int_{\mathcal{B}} \left[\hat{\psi}(F^*, \theta) - \hat{\psi}(F, \theta)\right] dm \le \int_{\mathcal{B}} \left[\hat{\varepsilon}(F^*, \eta^*) - \hat{\varepsilon}(F, \eta)\right] dm. \tag{15.16}$$

Since the work $W$ done by the external forces in going from $f$ to $f^*$ is the same in adiabatic and isothermal stability, it follows from (15.16) that if

$$\int_{\mathcal{B}} \left[\hat{\psi}(F^*, \theta) - \hat{\psi}(F, \theta)\right] dm - W \tag{15.17}$$

is non-negative, then

$$\int_{\mathcal{B}} \left[\hat{\varepsilon}(F^*, \eta^*) - \hat{\varepsilon}(F, \eta)\right] dm - W \tag{15.18}$$

is non-negative (and strictly positive when (15.17) is strictly positive). Hence, the isothermal stability of $\{f, \eta\}$ implies the corresponding adiabatic stability for $\{f, \eta\}$, q.e.d.

Although in writing our proof of Theorem 14 we have used a notation which implies that $\mathcal{B}$ is homogeneous, it is clear that the same argument is valid when $\mathcal{B}$ is not homogeneous.

It appears to us that the converse of Theorem 14 need not be true; i.e., an equilibrium state may have adiabatic stability without being isothermally stable.
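The key step in the proof of Theorem 14, that the free-energy difference at fixed temperature never exceeds the internal-energy difference at fixed total entropy, can be checked on a toy model. The sketch below is illustrative only. It assumes a one-dimensional stretch `F` per mass cell and a hypothetical energy function $\hat{\varepsilon}(F, \eta) = g(F)\,e^{\eta}$ with $g(F) = 1 + (F - 1)^2$. Neither the model nor the helper names come from the paper.

```python
import math
import random

def g(F):
    """Hypothetical stored-energy factor, positive for every stretch F."""
    return 1.0 + (F - 1.0) ** 2

def eps(F, eta):
    """Assumed energy function eps(F, eta) = g(F) * exp(eta); convex in eta."""
    return g(F) * math.exp(eta)

def eta_hat(F, theta):
    """Entropy at stretch F and temperature theta: solves theta = d(eps)/d(eta)."""
    return math.log(theta / g(F))

def psi(F, theta):
    """Free energy psi = eps - theta * eta evaluated at eta = eta_hat(F, theta)."""
    eta = eta_hat(F, theta)
    return eps(F, eta) - theta * eta

def energy_gaps(n=40, theta=2.0, seed=1):
    """Return (isothermal free-energy difference, adiabatic energy difference)."""
    rng = random.Random(seed)
    m = [rng.uniform(0.5, 1.5) for _ in range(n)]
    F = [rng.uniform(0.8, 1.2) for _ in range(n)]
    eta = [eta_hat(Fi, theta) for Fi in F]             # state with uniform temperature
    F_star = [rng.uniform(0.5, 1.5) for _ in range(n)]
    d = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    mean = sum(mi * di for mi, di in zip(m, d)) / sum(m)
    eta_star = [e + di - mean for e, di in zip(eta, d)]  # same total entropy
    iso = sum(mi * (psi(Fs, theta) - psi(Fi, theta))
              for mi, Fs, Fi in zip(m, F_star, F))
    adi = sum(mi * (eps(Fs, es) - eps(Fi, ei))
              for mi, Fs, es, Fi, ei in zip(m, F_star, eta_star, F, eta))
    return iso, adi
```

For every seed tried, `iso <= adi`, so a non-negative isothermal quantity forces a non-negative adiabatic one for the same work `W`, as the proof asserts.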


16. Gibbs’ thermostatics of fluids 
We now consider a type of stability which was proposed by GIBBS* for fluids free from body forces. GIBBS states** that he had in mind a physical situation in which the fluid is "enclosed in a rigid envelop which is non-conducting to heat and impermeable to all the components of the fluid". A body which may be regarded as being in such an envelop is usually called an "isolated system".


Definition of G stability***. An equilibrium state $\{f, \eta\}$ of a fluid body $\mathcal{B}$ is called G stable if the following condition is satisfied. Let $\{f^*, \eta^*\}$ be any other state with the same total volume and the same total entropy as $\{f, \eta\}$,

$$\int_{\mathcal{B}} v^*\, dm = \int_{\mathcal{B}} v\, dm, \qquad \int_{\mathcal{B}} \eta^*\, dm = \int_{\mathcal{B}} \eta\, dm; \tag{16.1}$$


* See the section of [1] which is entitled ‘Internal stability of homogeneous 
fluids as indicated by the fundamental equations’’, (b), pp. 100—115, particularly the 
subsection entitled ‘“‘Stability with respect to continuous changes of phase’ (b), 
PPadOo= tite 

ASN (ON Da LOO: 

*** In this definition we again restrict ourselves to those physical situations in which fluctuations in chemical composition are suppressed. We have in mind situations in which chemical reactions are prohibited and in which the fluid is either homogeneous or does not allow diffusion. For fluids the homogeneous case is the one of practical importance. Situations in which flow is permitted but diffusion is prohibited are rare.




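then $\{f, \eta\}$ has a lower total internal energy than $\{f^*, \eta^*\}$,

$$\int_{\mathcal{B}} \hat{\varepsilon}[v^*(X), \eta^*(X); X]\, dm > \int_{\mathcal{B}} \hat{\varepsilon}[v(X), \eta(X); X]\, dm, \tag{16.2}$$

unless $v^*(X) = v(X)$ and $\eta^*(X) = \eta(X)$ for all $X$ in $\mathcal{B}$.

In (16.1) $v$ and $v^*$ denote the specific volume fields for $\mathcal{B}$ corresponding to the configurations $f$ and $f^*$.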

In the following alternative definition $f$ and $\varepsilon$ are taken as the independent variables, and the permitted comparison states are such that the total internal energy and total volume of the body are conserved during the variations. This alternative formulation may suggest to the reader why G stability is regarded as being appropriate for discussing the physics of isolated systems composed of fluids:

Alternative definition of G stability. An equilibrium state $\{f, \varepsilon\}$ of a fluid body $\mathcal{B}$ is called G stable if any other state $\{f^*, \varepsilon^*\}$ with the same total volume and the same total internal energy as $\{f, \varepsilon\}$,

$$\int_{\mathcal{B}} v^*\, dm = \int_{\mathcal{B}} v\, dm, \qquad \int_{\mathcal{B}} \varepsilon^*\, dm = \int_{\mathcal{B}} \varepsilon\, dm, \tag{16.3}$$

has a lower total entropy,

$$\int_{\mathcal{B}} \bar{\eta}[v^*(X), \varepsilon^*(X); X]\, dm < \int_{\mathcal{B}} \bar{\eta}[v(X), \varepsilon(X); X]\, dm, \tag{16.4}$$

unless $v^*(X) = v(X)$ and $\varepsilon^*(X) = \varepsilon(X)$ for all $X$ in $\mathcal{B}$.

The function $\bar{\eta}$ in (16.4) is obtained by solving $\varepsilon = \hat{\varepsilon}(v, \eta; X)$ for $\eta$, which is possible in a unique way because $\hat{\varepsilon}$ is strictly increasing in $\eta$.

The proof of the equivalence of the two definitions of G stability is analogous 


to the one given for Theorem 11 of § 14 in the case of thermal stability; one 
must again use Postulate II of § 8. 


The main result of this section is 


Theorem 15. An equilibrium state $\{f, \eta\}$ of a fluid body is G stable if and only if its temperature and pressure are uniform.


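Proof. To prove that the condition is necessary we observe that the functions $v$, $\eta$ are solutions of the variational problem

$$\int_{\mathcal{B}} \hat{\varepsilon}(v^*, \eta^*; X)\, dm = \text{Minimum} \tag{16.5}$$

subject to the constraints (16.1). Therefore, the first variation of

$$\int_{\mathcal{B}} \left[\hat{\varepsilon}(v^*, \eta^*; X) - \lambda\,\eta^* - \mu\, v^*\right] dm$$

must vanish for $v^* = v$ and $\eta^* = \eta$. Here $\lambda$ and $\mu$ are constant Lagrange parameters. It follows that

$$\hat{\varepsilon}_\eta(v, \eta; X) = \lambda = \text{constant}, \qquad \hat{\varepsilon}_v(v, \eta; X) = \mu = \text{constant}. \tag{16.6}$$

Hence, by (8.2) and (11.11), both the temperature, $\theta = \hat{\varepsilon}_\eta(v, \eta; X)$, and the pressure, $p = -\hat{\varepsilon}_v(v, \eta; X)$, are uniform over $\mathcal{B}$.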


To prove the sufficiency of the condition of the theorem, we assume that $\theta$ and $p$ are uniform and that (16.1) holds. From the convexity inequality (11.16), the inequality (11.9), and the fact that $v = \nu^3$, $\nu > 0$, is a convex function of $\nu$, one can easily infer that $\hat{\varepsilon}(v, \eta)$ must be convex in $v$ and $\eta$. Hence, the inequality

$$\hat{\varepsilon}(v^*, \eta^*; X) - \hat{\varepsilon}(v, \eta; X) - (v^* - v)\,\hat{\varepsilon}_v(v, \eta; X) - (\eta^* - \eta)\,\hat{\varepsilon}_\eta(v, \eta; X) \ge 0 \tag{16.7}$$

is valid at all material points $X$ in $\mathcal{B}$; (16.7) cannot reduce to an equality for all $X$ unless $v(X) = v^*(X)$ and $\eta(X) = \eta^*(X)$ for all $X$. Since $p = -\hat{\varepsilon}_v(v, \eta)$ and $\theta = \hat{\varepsilon}_\eta(v, \eta)$ are independent of $X$, integration of (16.7) over $\mathcal{B}$ gives

$$\int_{\mathcal{B}} \left\{\hat{\varepsilon}(v^*, \eta^*; X) - \hat{\varepsilon}(v, \eta; X)\right\} dm + p \int_{\mathcal{B}} (v^* - v)\, dm - \theta \int_{\mathcal{B}} (\eta^* - \eta)\, dm > 0.$$

The condition (16.1) states that the last two terms vanish and hence that (16.2) holds, q.e.d.
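The sufficiency argument can be seen at work on a concrete fluid. The sketch below is an illustration under stated assumptions, not the authors' construction. It takes a homogeneous ideal-gas-like energy $\hat{\varepsilon}(v, \eta) = c\, v^{-R/c} e^{\eta/c}$ with hypothetical constants `C_V` and `R_GAS`, which is jointly convex in $v$ and $\eta$. A state with uniform $v$ and $\eta$, and hence uniform temperature and pressure, is compared with random states of the same total volume and total entropy.

```python
import math
import random

C_V, R_GAS = 1.5, 1.0   # assumed constants of the illustrative gas

def eps(v, eta):
    """Assumed energy function, jointly convex in (v, eta) for v > 0."""
    return C_V * v ** (-R_GAS / C_V) * math.exp(eta / C_V)

def temperature(v, eta):
    return eps(v, eta) / C_V                  # d(eps)/d(eta)

def pressure(v, eta):
    return R_GAS * temperature(v, eta) / v    # -d(eps)/d(v)

def constrained_perturbation(rng, masses, base, scale):
    """Field around `base` whose mass-weighted total equals that of the uniform field."""
    d = [rng.uniform(-scale, scale) for _ in masses]
    mean = sum(m * di for m, di in zip(masses, d)) / sum(masses)
    return [base + di - mean for di in d]

def energy_excess(seed, n=30, v0=1.0, eta0=0.0):
    """Total energy of a random comparison state minus that of the uniform state."""
    rng = random.Random(seed)
    masses = [rng.uniform(0.5, 1.5) for _ in range(n)]
    v_star = constrained_perturbation(rng, masses, v0, 0.3)      # stays positive
    eta_star = constrained_perturbation(rng, masses, eta0, 0.5)
    e_star = sum(m * eps(v, h) for m, v, h in zip(masses, v_star, eta_star))
    e_eq = sum(masses) * eps(v0, eta0)
    return e_star - e_eq
```

Each call to `energy_excess` returns a positive number for this model, matching the strict inequality (16.2).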

In his discussion of the stability of homogeneous fluids, GIBBS used a definition of stability which is identical to what we have called G stability, except that he did not demand, as we do, that $\{f, \eta\}$ be an equilibrium state*. GIBBS was able to prove that uniform values of $\hat{\varepsilon}_v(v, \eta)$ and $\hat{\varepsilon}_\eta(v, \eta)$ are necessary for his stability and, furthermore, that the inequality (16.7) is also necessary. He also realized that the constancy of $\hat{\varepsilon}_v$ and $\hat{\varepsilon}_\eta$ over $\mathcal{B}$ and the validity of (16.7) are sufficient for his stability. If he had gone a step further and postulated that for homogeneous fluids stable states exist for every value of $v$ and $\eta$ for which $\hat{\varepsilon}$ is defined, he would have obtained (16.7) as a property of the function $\hat{\varepsilon}$. Such a procedure, however, cannot yield the statements, made in Theorem 6, that $-\hat{\varepsilon}_v$ is positive and that $\hat{\varepsilon}$ is jointly and strictly convex in $\nu$ and $\eta$.


We conclude with 

Theorem 16. An equilibrium state $\{f, \eta\}$ of a fluid body $\mathcal{B}$ is G stable if and only if both of the following conditions hold:

(a) The temperature corresponding to $\{f, \eta\}$ is uniform.

(b) Any other state $\{f^*, \eta^*\}$ with the same total volume,

$$\int_{\mathcal{B}} v^*\, dm = \int_{\mathcal{B}} v\, dm, \tag{16.8}$$

and the same uniform temperature $\theta$ has a higher total free energy,

$$\int_{\mathcal{B}} \hat{\psi}(v^*, \theta; X)\, dm > \int_{\mathcal{B}} \hat{\psi}(v, \theta; X)\, dm, \tag{16.9}$$

unless $v^*(X) = v(X)$ for all $X$ in $\mathcal{B}$.

Proof. The proof that the conditions (a) and (b) are sufficient for the G stability of $\{f, \eta\}$ is completely analogous to the proof of Theorem 14 of §15.

The necessity of the condition (a) for the G stability of $\{f, \eta\}$ follows from Theorem 15. To prove that (b) is necessary we assume that $\{f, \eta\}$ is stable. We consider another state $\{f^*, \eta^*\}$ which obeys (16.8) and which has the uniform temperature $\theta$. Since $v = \nu^3$ is a convex function of $\nu$ for $\nu > 0$, and $\psi_\nu(\nu, \theta) < 0$, the inequality (13.15) implies that

$$\hat{\psi}(v^*, \theta; X) - \hat{\psi}(v, \theta; X) - (v^* - v)\,\hat{\psi}_v(v, \theta; X) \ge 0; \tag{16.10}$$


* GIBBS does not use either our Postulate I or our definition of (local) thermal equilibrium.




(16.10) cannot reduce to equality for all $X$ unless $v(X) = v^*(X)$ for all $X$. Now, since we are assuming that $\{f, \eta\}$ is stable, it follows from Theorem 15 and (13.13) that $\hat{\psi}_v(v, \theta; X)$ is independent of $X$. Thus, by (16.8), if we compute the mass integral of (16.10) over $\mathcal{B}$, the last term on the left makes no contribution, and we get (16.9). Hence, when $\{f, \eta\}$ is G stable, the condition (b) is valid, q.e.d.

This theorem shows that for G stability of fluids adiabatic and isothermal 
stability are equivalent. 


Acknowledgement. This research was supported in part by the Air Force Office 
of Scientific Research under Contract AF 49(638)-541 with the Mellon Institute and 
by the National Science Foundation under Grant NSF-G 5250 to Carnegie Institute 
of Technology. 


References 


[1] (a) GIBBS, J. W.: On the equilibrium of heterogeneous substances. Trans. Conn. Acad. 3, 108–248, 343–524 (1875–1878), or (b) The Scientific Papers of J. WILLARD GIBBS 1, 55–372, particularly 55–62 and 100–115, Longmans, Green, 1906.

[2] DUHEM, P.: Dissolutions et Mélanges. Travaux et Mémoires des Facultés de Lille 3, No. 2, 1–136 (1893).

[3] TRUESDELL, C.: Das ungelöste Hauptproblem der endlichen Elastizitätstheorie. Z. angew. Math. u. Mech. 36, 97–103 (1956).

[4] HADAMARD, J.: Leçons sur la Propagation des Ondes et les Équations de l'Hydrodynamique. Paris 1903.

[5] ERICKSEN, J. L., & R. A. TOUPIN: Implications of Hadamard's condition for elastic stability with respect to uniqueness theorems. Can. J. Math. 8, 432–436 (1956).

[6] DUHEM, P.: Recherches sur l'élasticité. Troisième partie. La stabilité des milieux élastiques. Ann. École Norm. (3) 22, 143–217 (1905).

[7] NOLL, W.: A mathematical theory of the mechanical behavior of continuous media. Arch. Rat. Mech. Anal. 2, 197–226 (1958).

[8] NOLL, W.: The foundations of classical mechanics in the light of recent advances in continuum mechanics. Proceedings of the Berkeley Symposium on the Axiomatic Method, 266–281, 1959.

[9] TRUESDELL, C.: The mechanical foundations of elasticity and fluid dynamics. J. Rat. Mech. Anal. 1, 125–300 (1952); 2, 593–616 (1953).

[10] NOLL, W.: On the continuity of the solid and fluid states. J. Rat. Mech. Anal. 4, 3–81 (1955).

[11] RAMSEY, N. F.: Thermodynamics and statistical mechanics at negative absolute temperatures. Phys. Rev. 103, 20–28 (1956).

[12] COLEMAN, B. D., & W. NOLL: Conditions for equilibrium at negative absolute temperatures. Phys. Rev. 115, 262–265 (1959).

[13] BAKER, M., & J. L. ERICKSEN: Inequalities restricting the form of the stress-deformation relations for isotropic elastic solids and Reiner-Rivlin fluids. J. Washington Acad. Sciences 44, 33–35 (1954).

[14] BARTA, J.: On the non-linear elasticity law. Acta Tech. Acad. Scient. Hungaricae 18, 55–65 (1957).

[15] HILL, R.: On uniqueness and stability in the theory of finite elastic strain. J. Mech. Phys. Solids 5, 229–241 (1957).


Mellon Institute 
Pittsburgh, Pennsylvania 
and 
Carnegie Institute of Technology 
Pittsburgh, Pennsylvania 


(Received August 24, 1959)


The Formulation of Constitutive Equations 
in Continuum Physics. I 


A. C. PIPKIN & R. S. RIVLIN


1. Introduction 

Many of the physical properties of materials can be expressed by relations 
between tensors. Ohm’s law, for example, describes the electrical conductivity 
of a material by means of a relation between two vectors, the field strength and 
the current density. The Navier-Stokes equation, a relation between two second 
order tensors, specifies the viscous properties of certain materials. Such relations 
as these are called “constitutive equations’’. 

Constitutive relations for deformable solids frequently involve quantities specifying the amount of deformation to which the material is subjected. If the deformation of a body is described by equations of the form

$$x_i = x_i(X_p, t), \tag{1.1}$$

giving the rectangular Cartesian coordinates $x_i$ of each particle at time $t$ in terms of its coordinates $X_p$ at some reference time $t = 0$, then the deformation gradients $\partial x_i/\partial X_j$ are measures of the deformation in the neighborhood of a given particle. As measures of rotation, they may be involved not only in relations describing deformable bodies, but also in equations relevant to ideally rigid materials. Relations of the general form

$$u_{i_1 i_2 \ldots i_\mu} = f_{i_1 i_2 \ldots i_\mu}\!\left(\partial x_p/\partial X_q,\; v_p^{(\alpha)}\right), \tag{1.2}$$

specifying the components of a tensor $\mathbf{u}$ as functions of these deformation gradients and the components of a number of vectors $\mathbf{v}^{(\alpha)}$ ($\alpha = 1, 2, \ldots, \nu$), are encountered especially often, although dependence on the deformation gradients $\partial x_i/\partial X_j$ is frequently not taken into account explicitly if the material is regarded as rigid.

Ohm's law is a special case of (1.2) with $\mu = \nu = 1$, the vector $\mathbf{u}$ being the current density and $\mathbf{v}^{(1)}$ the electric field strength. A generalized Hooke's law has $\mu = 2$ and $\nu = 0$, so that the second-rank stress tensor $\mathbf{u}$ depends upon the deformation gradients alone. As further examples, in the case $\mu = \nu = 0$ the scalar $u$ might represent the elastic strain-energy, a function of the deformation gradients, or, with $\mu = 0$ and $\nu = 2$, $u$ might be the free energy of a deformed material with electrical polarization $\mathbf{v}^{(1)}$ and magnetization $\mathbf{v}^{(2)}$. In cases with $\mu = \nu = 1$ other than Ohm's law, the vectors $\mathbf{u}$ and $\mathbf{v}^{(1)}$ might represent respectively the heat flux and temperature gradient in a deformed thermal conductor, the fluid velocity and pressure gradient in a porous solid (Darcy's law of diffusion), or the electrical polarization and electric field strength in a deformed dielectric.

Arch. Rational Mech. Anal., Vol. 4 9 




VOIGT's classical treatise [1] on crystal physics and NYE's more recent work [2] consider a great many relations which are special cases of (1.2). In these linear treatments, the deformation gradients are assumed to appear only through the components of the classical infinitesimal strain tensor. This assumption is usually valid if the deformations to which the material is subjected are sufficiently small. It then arises from the fact that the form of the constitutive equation, in a rectangular Cartesian coordinate system, is unaltered by a simultaneous rigid rotation of the reference system and the whole physical system considered. Further restrictions are imposed on the form of the relations by any symmetry which the material may possess. In both VOIGT's and NYE's books, the investigation of these restrictions is considerably simplified by the linearity of the relations discussed.

In the present paper, we eschew assumptions of linearity in the constitutive equation and regard $f_{i_1 i_2 \ldots i_\mu}$ in (1.2) as single-valued functions or as polynomials of arbitrary degree in the indicated arguments. We then investigate the restrictions which can be imposed on the forms of these functions by the requirement that the constitutive equation (1.2) be unaltered by a simultaneous rigid rotation of the physical system and the reference system. We next discuss how the further limitations which may be imposed on the form of the constitutive equation (1.2) by various assumed types of material symmetry may be derived.

A number of particular cases of the results given in the present paper are already known, notably that in which $\mu = 0$ and $\nu = 0$ or $1$ in (1.2) and that in which $\mu = 2$ and $\nu = 0$ (see, for example, [3] to [7]).


2. Form-invariance under rotation of the physical system 


We consider that a body which is homogeneous at time $t = 0$ undergoes a deformation described by (1.1). We suppose that some physical property of the material of which the body is composed is described in the coordinate system $x$ by the constitutive equation (1.2), in which $f_{i_1 i_2 \ldots i_\mu}$ are polynomial functions of the arguments.

If we simultaneously subject the deformed body and the vector fields $\mathbf{v}^{(\alpha)}$ to a common arbitrary rigid rotation, the tensor $\mathbf{u}$ will be subjected to the same rotation. This fact imposes a restriction on the forms which can be taken by the functions $f_{i_1 i_2 \ldots i_\mu}$ in (1.2).

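We assume that in the rigid rotation, the particle at $x_i$ in the coordinate system $x$ moves to $\bar{x}_i$ in the same system, where

$$\bar{x}_i = a_{ij}\, x_j \tag{2.1}$$

and

$$a_{ik}\, a_{jk} = a_{ki}\, a_{kj} = \delta_{ij}, \qquad |a_{ij}| = 1. \tag{2.2}$$

Let $\bar{\mathbf{v}}^{(\alpha)}$ be the vectors into which the vectors $\mathbf{v}^{(\alpha)}$ are rotated, and let $\bar{v}_i^{(\alpha)}$ be the components of $\bar{\mathbf{v}}^{(\alpha)}$ in the coordinate system $x$. Also, let $\bar{\mathbf{u}}$ be the tensor into which $\mathbf{u}$ is rotated, and let the components of $\bar{\mathbf{u}}$ in the system $x$ be $\bar{u}_{i_1 i_2 \ldots i_\mu}$.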

We now use an auxiliary rectangular Cartesian coordinate system $y$, obtained from the system $x$ by the rigid rotation to which the body is subjected. The point at $\bar{x}_i$ in the system $x$ has coordinates $x_i$ in the system $y$, and the components of the vector $\bar{\mathbf{v}}^{(\alpha)}$ are $v_i^{(\alpha)}$ in the coordinate system $y$. Similarly, the components in the system $y$ of the tensor $\bar{\mathbf{u}}$ are $u_{i_1 i_2 \ldots i_\mu}$. Since the direction cosines of the axis $y_j$ in the coordinate system $x$ are $a_{ij}$ ($i = 1, 2, 3$), the components of the vector $\bar{\mathbf{v}}^{(\alpha)}$ in the two coordinate systems are related by

$$\bar{v}_i^{(\alpha)} = a_{ij}\, v_j^{(\alpha)}, \tag{2.3}$$

and the components of the tensor $\bar{\mathbf{u}}$ in the coordinate system $y$ are related to its components in the system $x$ by

$$\bar{u}_{i_1 i_2 \ldots i_\mu} = a_{i_1 j_1}\, a_{i_2 j_2} \cdots a_{i_\mu j_\mu}\, u_{j_1 j_2 \ldots j_\mu}. \tag{2.4}$$

Also, from (2.1) and (2.2) we obtain

$$\frac{\partial \bar{x}_i}{\partial X_j} = a_{ik}\, \frac{\partial x_k}{\partial X_j}. \tag{2.5}$$
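Applying the relation (1.2) to the physical situation existing after the rotation has taken place, we have

$$\bar{u}_{i_1 i_2 \ldots i_\mu} = f_{i_1 i_2 \ldots i_\mu}\!\left(\partial \bar{x}_p/\partial X_q,\; \bar{v}_p^{(\alpha)}\right), \tag{2.6}$$

all quantities being measured in the system $x$ as prescribed. Using equations (1.2), (2.4), and (2.6), we obtain

$$f_{i_1 i_2 \ldots i_\mu}\!\left(\partial \bar{x}_p/\partial X_q,\; \bar{v}_p^{(\alpha)}\right) = a_{i_1 j_1}\, a_{i_2 j_2} \cdots a_{i_\mu j_\mu}\, f_{j_1 j_2 \ldots j_\mu}\!\left(\partial x_p/\partial X_q,\; v_p^{(\alpha)}\right), \tag{2.7}$$

where the arguments on the left- and right-hand sides are related by (2.3) and (2.5), for all $a_{ij}$ satisfying (2.2).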

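Equation (2.7) expresses a restriction on the forms taken by the functions $f_{i_1 i_2 \ldots i_\mu}$, which may be expressed in a more convenient form in the following way. Let $\bar{\mathbf{v}}^{(\alpha)}$ ($\alpha = \nu+1, \nu+2, \ldots, \nu+\mu$) be $\mu$ arbitrary vectors, with components $v_i^{(\alpha)}$ and $\bar{v}_i^{(\alpha)}$ in the coordinate systems $y$ and $x$ respectively. Equation (2.3), which was originally written for $\alpha = 1, 2, \ldots, \nu$, holds for $\alpha = \nu+1, \nu+2, \ldots, \nu+\mu$ as well. Multiplying (2.7) throughout by $\bar{v}_{i_1}^{(\nu+1)}\, \bar{v}_{i_2}^{(\nu+2)} \cdots \bar{v}_{i_\mu}^{(\nu+\mu)}$ and using (2.3) and (2.2), we obtain

$$\bar{v}_{i_1}^{(\nu+1)}\, \bar{v}_{i_2}^{(\nu+2)} \cdots \bar{v}_{i_\mu}^{(\nu+\mu)}\, f_{i_1 i_2 \ldots i_\mu}\!\left(\partial \bar{x}_p/\partial X_q,\; \bar{v}_p^{(\alpha)}\right) = v_{i_1}^{(\nu+1)}\, v_{i_2}^{(\nu+2)} \cdots v_{i_\mu}^{(\nu+\mu)}\, f_{i_1 i_2 \ldots i_\mu}\!\left(\partial x_p/\partial X_q,\; v_p^{(\alpha)}\right) = F \text{ (say)}. \tag{2.8}$$

Equation (2.8) expresses the fact that $F$ is a polynomial scalar invariant, under the proper orthogonal transformation group, of the vectors whose components in the coordinate system $x$ are $\partial x_i/\partial X_1$, $\partial x_i/\partial X_2$, $\partial x_i/\partial X_3$, and $v_i^{(\alpha)}$ ($\alpha = 1, 2, \ldots, \nu+\mu$).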

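$F$ is therefore expressible as a polynomial in the elements of an integrity basis for invariants, under the proper orthogonal transformation group, of these $\nu+\mu+3$ vectors. Such an integrity basis is given (see, for example, WEYL [8]) by

$$\begin{gathered} G_{pq}, \qquad \frac{\partial x_i}{\partial X_p}\, v_i^{(\alpha)}, \qquad v_i^{(\alpha)} v_i^{(\beta)}, \\ e_{ijk}\, \frac{\partial x_i}{\partial X_1} \frac{\partial x_j}{\partial X_2} \frac{\partial x_k}{\partial X_3}, \qquad e_{ijk}\, \frac{\partial x_i}{\partial X_p} \frac{\partial x_j}{\partial X_q}\, v_k^{(\alpha)}, \\ e_{ijk}\, \frac{\partial x_i}{\partial X_p}\, v_j^{(\alpha)} v_k^{(\beta)}, \qquad e_{ijk}\, v_i^{(\alpha)} v_j^{(\beta)} v_k^{(\gamma)}, \end{gathered} \tag{2.9}$$

where $G_{pq}$ is defined by

$$G_{pq} = \frac{\partial x_i}{\partial X_p} \frac{\partial x_i}{\partial X_q} \tag{2.10}$$

and $e_{ijk}$ is the alternating tensor of rank three. The invariants (2.9) are the inner products and scalar triple products of the $\nu+\mu+3$ vectors.

The invariants (2.9) are not functionally independent. If we use the notation

$$v_i^{(\nu+\mu+1)} = \frac{\partial x_i}{\partial X_1}, \qquad v_i^{(\nu+\mu+2)} = \frac{\partial x_i}{\partial X_2} \qquad \text{and} \qquad v_i^{(\nu+\mu+3)} = \frac{\partial x_i}{\partial X_3}, \tag{2.11}$$

then the invariants (2.9) may be re-written as

$$v_i^{(\alpha)} v_i^{(\beta)} \quad \text{and} \quad e_{ijk}\, v_i^{(\alpha)} v_j^{(\beta)} v_k^{(\gamma)} \qquad (\alpha, \beta, \gamma = 1, 2, \ldots, \nu+\mu+3). \tag{2.12}$$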


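The following functional relations hold between these invariants (see, for example, H. WEYL [8], p. 75):

$$\left(e_{ijk}\, v_i^{(\alpha_1)} v_j^{(\alpha_2)} v_k^{(\alpha_3)}\right) \left(e_{lmn}\, v_l^{(\beta_1)} v_m^{(\beta_2)} v_n^{(\beta_3)}\right) = \begin{vmatrix} v_i^{(\alpha_1)} v_i^{(\beta_1)} & v_i^{(\alpha_1)} v_i^{(\beta_2)} & v_i^{(\alpha_1)} v_i^{(\beta_3)} \\ v_i^{(\alpha_2)} v_i^{(\beta_1)} & v_i^{(\alpha_2)} v_i^{(\beta_2)} & v_i^{(\alpha_2)} v_i^{(\beta_3)} \\ v_i^{(\alpha_3)} v_i^{(\beta_1)} & v_i^{(\alpha_3)} v_i^{(\beta_2)} & v_i^{(\alpha_3)} v_i^{(\beta_3)} \end{vmatrix}, \tag{2.13}$$

$$\begin{vmatrix} v_i^{(\alpha_1)} v_i^{(\beta_1)} & v_i^{(\alpha_1)} v_i^{(\beta_2)} & v_i^{(\alpha_1)} v_i^{(\beta_3)} & v_i^{(\alpha_1)} v_i^{(\beta_4)} \\ v_i^{(\alpha_2)} v_i^{(\beta_1)} & v_i^{(\alpha_2)} v_i^{(\beta_2)} & v_i^{(\alpha_2)} v_i^{(\beta_3)} & v_i^{(\alpha_2)} v_i^{(\beta_4)} \\ v_i^{(\alpha_3)} v_i^{(\beta_1)} & v_i^{(\alpha_3)} v_i^{(\beta_2)} & v_i^{(\alpha_3)} v_i^{(\beta_3)} & v_i^{(\alpha_3)} v_i^{(\beta_4)} \\ v_i^{(\alpha_4)} v_i^{(\beta_1)} & v_i^{(\alpha_4)} v_i^{(\beta_2)} & v_i^{(\alpha_4)} v_i^{(\beta_3)} & v_i^{(\alpha_4)} v_i^{(\beta_4)} \end{vmatrix} = 0. \tag{2.14}$$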

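Taking $\alpha_1 = \beta_1 = \nu+\mu+1$, $\alpha_2 = \beta_2 = \nu+\mu+2$, and $\alpha_3 = \beta_3 = \nu+\mu+3$ in (2.13), we have, with (2.10) and (2.11),

$$\left(e_{ijk}\, \frac{\partial x_i}{\partial X_1} \frac{\partial x_j}{\partial X_2} \frac{\partial x_k}{\partial X_3}\right)^{2} = |G_{pq}| = G \text{ (say)}. \tag{2.15}$$

Since for any deformation possible in a real material

$$e_{ijk}\, \frac{\partial x_i}{\partial X_1} \frac{\partial x_j}{\partial X_2} \frac{\partial x_k}{\partial X_3} > 0, \tag{2.16}$$

we have, from (2.15),

$$e_{ijk}\, \frac{\partial x_i}{\partial X_1} \frac{\partial x_j}{\partial X_2} \frac{\partial x_k}{\partial X_3} = G^{1/2}. \tag{2.17}$$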

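From (2.13) and the fact that $F$ may be expressed as a polynomial in the quantities (2.12), it follows that $F$ may be expressed in the form

$$F = \bar{P} + \sum_{\alpha, \beta, \gamma = 1}^{\nu+\mu+3} e_{ijk}\, v_i^{(\alpha)} v_j^{(\beta)} v_k^{(\gamma)}\, \bar{Q}^{(\alpha\beta\gamma)}, \tag{2.18}$$

where $\bar{P}$ and $\bar{Q}^{(\alpha\beta\gamma)}$ are polynomials in $v_i^{(\alpha)} v_i^{(\beta)}$ ($\alpha, \beta = 1, 2, \ldots, \nu+\mu+3$). Taking $\beta_1 = \nu+\mu+1$, $\beta_2 = \nu+\mu+2$, $\beta_3 = \nu+\mu+3$ in (2.13), we see, with (2.10) and (2.17), that $e_{ijk}\, v_i^{(\alpha)} v_j^{(\beta)} v_k^{(\gamma)}\, G^{1/2}$ is expressible as a polynomial in the quantities $v_i^{(\alpha)} v_i^{(\beta)}$ ($\alpha, \beta = 1, 2, \ldots, \nu+\mu+3$). Introducing this result into (2.18), we see that

$$F = P + G^{-1/2}\, Q, \tag{2.19}$$

where $P$ and $Q$ are polynomials in $v_i^{(\alpha)} v_i^{(\beta)}$ ($\alpha, \beta = 1, 2, \ldots, \nu+\mu+3$).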




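Since, from (2.8), $F$ is multilinear in the vectors $v_i^{(\alpha)}$ ($\alpha = \nu+1, \nu+2, \ldots, \nu+\mu$), we may express $F$ as the sum of a number of terms of the form

$$R\left(P^* + G^{-1/2}\, Q^*\right), \tag{2.20}$$

where $R$ is the product of at most $\mu$ factors of the type $v_i^{(\alpha)} v_i^{(\beta)}$ ($\alpha = \nu+1, \nu+2, \ldots, \nu+\mu$) and $P^*$ and $Q^*$ are polynomials in $G_{pq}$, $(\partial x_i/\partial X_p)\, v_i^{(\alpha)}$ ($\alpha = 1, 2, \ldots, \nu+\mu$) and $v_i^{(\alpha)} v_i^{(\beta)}$ ($\alpha, \beta = 1, 2, \ldots, \nu$).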

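Taking $\alpha_1 = \beta_1 = \nu+\mu+1$, $\alpha_2 = \beta_2 = \nu+\mu+2$, $\alpha_3 = \beta_3 = \nu+\mu+3$, $\alpha_4 = \alpha$, $\beta_4 = \beta$ in (2.14), we obtain with (2.10) and (2.11) and the notation

$$V_p^{(\alpha)} = \frac{\partial x_i}{\partial X_p}\, v_i^{(\alpha)}, \tag{2.21}$$

the result

$$\begin{vmatrix} G_{11} & G_{12} & G_{13} & V_1^{(\beta)} \\ G_{21} & G_{22} & G_{23} & V_2^{(\beta)} \\ G_{31} & G_{32} & G_{33} & V_3^{(\beta)} \\ V_1^{(\alpha)} & V_2^{(\alpha)} & V_3^{(\alpha)} & v_i^{(\alpha)} v_i^{(\beta)} \end{vmatrix} = 0. \tag{2.22}$$

Whence, we readily obtain

$$v_i^{(\alpha)} v_i^{(\beta)}\, G = \tfrac{1}{2}\, e_{pqr}\, e_{stu}\, G_{qt}\, G_{ru}\, V_p^{(\alpha)}\, V_s^{(\beta)}. \tag{2.23}$$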
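The relation between inner products of the original vectors and their components along the deformation gradients can be verified numerically. The sketch below checks, for one sample matrix `A` standing for $\partial x_i/\partial X_p$ and two sample vectors, that $(\mathbf{v}\cdot\mathbf{w})\,\det G = V_p\, (\operatorname{adj} G)_{pq}\, W_q$ with $G = A^{\mathsf T} A$, $V = A^{\mathsf T}\mathbf{v}$, $W = A^{\mathsf T}\mathbf{w}$. This is the expansion of the bordered $4 \times 4$ determinant along its last row and column. The sample numbers and helper names are arbitrary illustrative choices.

```python
def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def adjugate3(m):
    """Transpose of the cofactor matrix of a 3x3 matrix."""
    adj = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            r = [k for k in range(3) if k != i]
            c = [k for k in range(3) if k != j]
            minor = m[r[0]][c[0]] * m[r[1]][c[1]] - m[r[0]][c[1]] * m[r[1]][c[0]]
            adj[j][i] = (-1) ** (i + j) * minor
    return adj

def identity_gap(A, v, w):
    """(v . w) det G minus V^T adj(G) W, which should vanish up to rounding."""
    At = transpose(A)
    G = matmul(At, A)
    V, W = matvec(At, v), matvec(At, w)
    return dot(v, w) * det3(G) - dot(V, matvec(adjugate3(G), W))

A = [[1.2, 0.3, 0.0], [0.1, 0.9, 0.2], [0.0, 0.4, 1.1]]   # sample gradient, det > 0
gap = identity_gap(A, [0.5, -1.0, 2.0], [1.5, 0.25, -0.75])
```

The value of `gap` is zero to rounding error, so any inner product of two vectors can be traded for $G^{-1}$ times a polynomial in $G_{pq}$ and the components $V_p$.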


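If $R$ in (2.20) is the product of $\lambda$ ($\le \mu$) of the factors $v_i^{(\alpha)} v_i^{(\beta)}$ ($\alpha = \nu+1, \nu+2, \ldots, \nu+\mu$), then with (2.23) and bearing in mind that $G$ may be expressed as a polynomial in $G_{pq}$, we obtain

$$R = G^{-\lambda}\, R^*, \tag{2.24}$$

where $R^*$ is a polynomial in $G_{pq}$ and $V_p^{(\alpha)}$. Introducing (2.24) into (2.20), we see that

$$F = G^{-\mu}\left(P + G^{-1/2}\, Q\right), \tag{2.25}$$

where $P$ and $Q$ are polynomials in $G_{pq}$, $V_p^{(\alpha)}$ ($\alpha = 1, 2, \ldots, \nu+\mu$) and $v_i^{(\alpha)} v_i^{(\beta)}$ ($\alpha, \beta = 1, 2, \ldots, \nu$).

Since $F$ is multilinear in the triads $v_i^{(\alpha)}$, $P$ and $Q$ must be multilinear in the triads $V_p^{(\alpha)}$ ($\alpha = \nu+1, \nu+2, \ldots, \nu+\mu$). We thus have

$$F = G^{-\mu}\, V_{j_1}^{(\nu+1)}\, V_{j_2}^{(\nu+2)} \cdots V_{j_\mu}^{(\nu+\mu)} \left(P_{j_1 j_2 \ldots j_\mu} + G^{-1/2}\, Q_{j_1 j_2 \ldots j_\mu}\right), \tag{2.26}$$

where $P_{j_1 j_2 \ldots j_\mu}$ and $Q_{j_1 j_2 \ldots j_\mu}$ are polynomials in $G_{pq}$, $V_p^{(\alpha)}$ ($\alpha = 1, 2, \ldots, \nu$) and $v_i^{(\alpha)} v_i^{(\beta)}$ ($\alpha, \beta = 1, 2, \ldots, \nu$). It then follows from (1.2), (2.8) and (2.26), using the definition (2.21) of $V_p^{(\alpha)}$, that

$$u_{i_1 i_2 \ldots i_\mu} = \frac{\partial x_{i_1}}{\partial X_{j_1}} \frac{\partial x_{i_2}}{\partial X_{j_2}} \cdots \frac{\partial x_{i_\mu}}{\partial X_{j_\mu}}\, \bar{F}_{j_1 j_2 \ldots j_\mu}, \tag{2.27}$$

where

$$\bar{F}_{j_1 j_2 \ldots j_\mu} = G^{-\mu}\left(P_{j_1 j_2 \ldots j_\mu} + G^{-1/2}\, Q_{j_1 j_2 \ldots j_\mu}\right). \tag{2.28}$$

Alternatively, by employing the relation (2.23), we see that $P_{j_1 j_2 \ldots j_\mu}$ and $Q_{j_1 j_2 \ldots j_\mu}$ in (2.26), and hence in (2.28), may be regarded as polynomials in $G_{pq}$, $V_p^{(\alpha)}$ ($\alpha = 1, 2, \ldots, \nu$) and $G^{-1}$. $\bar{F}_{j_1 j_2 \ldots j_\mu}$ in (2.27) may, of course, be regarded as a polynomial in $G_{pq}$, $V_p^{(\alpha)}$ ($\alpha = 1, 2, \ldots, \nu$) and $G^{-1/2}$.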
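If $f_{i_1 i_2 \ldots i_\mu}$ in (1.2) is regarded as a single-valued, rather than a polynomial, function of $\partial x_p/\partial X_q$ and $v_p^{(\alpha)}$, then equation (2.8) expresses the fact that $F$ is a single-valued scalar invariant, under the proper orthogonal transformation group, of the vectors $\partial x_i/\partial X_1$, $\partial x_i/\partial X_2$, $\partial x_i/\partial X_3$ and $v_i^{(\alpha)}$ ($\alpha = 1, 2, \ldots, \nu+\mu$). $F$ is therefore expressible as a single-valued function of the quantities (2.9), which form a functional basis for invariants of the vectors. Relations of the types (2.13), (2.21) and (2.23) may then be used to obtain the result that $F$ is expressible as a single-valued function of $G_{pq}$ and $V_p^{(\alpha)}$ ($\alpha = 1, 2, \ldots, \nu+\mu$). Since $F$ defined by (2.8) is multilinear in the triads $v_i^{(\alpha)}$ ($\alpha = \nu+1, \nu+2, \ldots, \nu+\mu$), $\partial^\mu F / \partial v_{p_1}^{(\nu+1)}\, \partial v_{p_2}^{(\nu+2)} \cdots \partial v_{p_\mu}^{(\nu+\mu)}$ is independent of $v_i^{(\alpha)}$ ($\alpha = \nu+1, \nu+2, \ldots, \nu+\mu$). Now

$$\frac{\partial^\mu F}{\partial v_{p_1}^{(\nu+1)}\, \partial v_{p_2}^{(\nu+2)} \cdots \partial v_{p_\mu}^{(\nu+\mu)}} = \frac{\partial^\mu F}{\partial V_{q_1}^{(\nu+1)}\, \partial V_{q_2}^{(\nu+2)} \cdots \partial V_{q_\mu}^{(\nu+\mu)}}\; \frac{\partial V_{q_1}^{(\nu+1)}}{\partial v_{p_1}^{(\nu+1)}}\, \frac{\partial V_{q_2}^{(\nu+2)}}{\partial v_{p_2}^{(\nu+2)}} \cdots \frac{\partial V_{q_\mu}^{(\nu+\mu)}}{\partial v_{p_\mu}^{(\nu+\mu)}}, \tag{2.29}$$

and $\partial V_{q_1}^{(\nu+1)}/\partial v_{p_1}^{(\nu+1)}$, $\partial V_{q_2}^{(\nu+2)}/\partial v_{p_2}^{(\nu+2)}$, ..., $\partial V_{q_\mu}^{(\nu+\mu)}/\partial v_{p_\mu}^{(\nu+\mu)}$ are, from (2.21), independent of $v_i^{(\alpha)}$ ($\alpha = \nu+1, \nu+2, \ldots, \nu+\mu$). Therefore $\partial^\mu F / \partial V_{q_1}^{(\nu+1)}\, \partial V_{q_2}^{(\nu+2)} \cdots \partial V_{q_\mu}^{(\nu+\mu)}$ is independent of $V_p^{(\alpha)}$ ($\alpha = \nu+1, \nu+2, \ldots, \nu+\mu$) and $F$ is multilinear in the triads $V_p^{(\alpha)}$ ($\alpha = \nu+1, \nu+2, \ldots, \nu+\mu$). We then obtain from (1.2) and (2.8), using the definition (2.21) of $V_p^{(\alpha)}$, the expression (2.27) for $u_{i_1 i_2 \ldots i_\mu}$, in which $\bar{F}_{j_1 j_2 \ldots j_\mu}$ is a single-valued function of $G_{pq}$ and $V_p^{(\alpha)}$ ($\alpha = 1, 2, \ldots, \nu$).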


3. Symmetries of the material 


We shall now consider the further restrictions imposed on constitutive equa- 
tions of the form (1.2) by any symmetry which the material may possess in its 
initial state, in which it is undeformed and $v_i^{(\alpha)}=0$. The symmetry of the material
is characterized with reference to a rectangular Cartesian coordinate system $x$
by a group of transformations $\{S\}$ which is a sub-group of the full orthogonal
group. Each transformation of the group $\{S\}$ transforms the coordinate system $x$
into an equivalent coordinate system $\bar x$ (say), in which the constitutive equation
takes the same form as in the system $x$. Let the systems $x$ and $\bar x$ be related
by a generic transformation of the group $\{S\}$,

$$\bar x_i = s_{ij}x_j. \tag{3.1}$$

Then

$$s_{ij}s_{ik} = s_{ji}s_{ki} = \delta_{jk}. \tag{3.2}$$


The constitutive equation for the material must be form-invariant under 
the transformation (3.1). 

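We have seen that the constitutive equation (1.2) must be expressible in
the form [cf. equation (2.27)]

$$u_{i_1 i_2\dots i_\mu}=\frac{\partial x_{i_1}}{\partial X_{j_1}}\frac{\partial x_{i_2}}{\partial X_{j_2}}\cdots\frac{\partial x_{i_\mu}}{\partial X_{j_\mu}}\,F_{j_1 j_2\dots j_\mu}\bigl(G_{pq},V_p^{(\alpha)}\bigr), \tag{3.3}$$

where $F_{j_1 j_2\dots j_\mu}$ is a single-valued function of its arguments $G_{pq}$ and $V_p^{(\alpha)}$
($\alpha=1,2,\dots,\nu$). The form-invariance of (3.3) implies that

$$\bar u_{i_1 i_2\dots i_\mu}=\frac{\partial \bar x_{i_1}}{\partial \bar X_{j_1}}\frac{\partial \bar x_{i_2}}{\partial \bar X_{j_2}}\cdots\frac{\partial \bar x_{i_\mu}}{\partial \bar X_{j_\mu}}\,F_{j_1 j_2\dots j_\mu}\bigl(\bar G_{pq},\bar V_p^{(\alpha)}\bigr), \tag{3.4}$$

where

$$\bar u_{i_1 i_2\dots i_\mu}=s_{i_1 j_1}s_{i_2 j_2}\cdots s_{i_\mu j_\mu}\,u_{j_1 j_2\dots j_\mu},\qquad \bar X_i=s_{ij}X_j\quad\text{and}\quad \bar v_i^{(\alpha)}=s_{ij}v_j^{(\alpha)}. \tag{3.5}$$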


General Constitutive Equations 135 


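$\bar G_{pq}$ and $\bar V_p^{(\alpha)}$ are defined by

$$\bar G_{pq}=\frac{\partial \bar x_i}{\partial \bar X_p}\frac{\partial \bar x_i}{\partial \bar X_q}\quad\text{and}\quad \bar V_p^{(\alpha)}=\frac{\partial \bar x_i}{\partial \bar X_p}\,\bar v_i^{(\alpha)}. \tag{3.6}$$

It is easily seen from (3.1), (3.2), and (3.5), with the definitions (3.6), (2.10),
and (2.21), that

$$\bar G_{pq}=s_{pm}s_{qn}G_{mn}\quad\text{and}\quad \bar V_p^{(\alpha)}=s_{pm}V_m^{(\alpha)}, \tag{3.7}$$

so that $G_{pq}$ and $\bar G_{pq}$ are the components of a second-order symmetric tensor
in the systems $x$ and $\bar x$ respectively, and $V_p^{(\alpha)}$ and $\bar V_p^{(\alpha)}$ are the components of a
vector in these systems.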
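The transformation laws (3.7) can be checked numerically. The following is a minimal sketch, assuming an arbitrary (hypothetical) deformation gradient `grad`, vector `v` and orthogonal matrix `s`; none of these numbers come from the paper. It builds $G_{pq}$ and $V_p$ in both coordinate systems directly from their definitions and compares the barred quantities with $s\,G\,s^T$ and $s\,V$.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def max_abs_diff(a, b):
    """Largest absolute difference between two equally shaped nested lists."""
    if isinstance(a, (int, float)):
        return abs(a - b)
    return max(max_abs_diff(x, y) for x, y in zip(a, b))

# Hypothetical data: grad[i][j] stands for dx_i/dX_j, v for one vector v_i.
grad = [[1.2, 0.3, 0.0], [-0.1, 0.9, 0.4], [0.2, 0.0, 1.1]]
v = [0.5, -1.0, 2.0]

# An orthogonal s: a rotation about x3 combined with a reflection.
t = 0.7
rot = [[math.cos(t), math.sin(t), 0.0],
       [-math.sin(t), math.cos(t), 0.0],
       [0.0, 0.0, 1.0]]
refl = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
s = matmul(refl, rot)

# G_pq = (dx_i/dX_p)(dx_i/dX_q) and V_p = (dx_i/dX_p) v_i in the system x.
G = matmul(transpose(grad), grad)
V = matvec(transpose(grad), v)

# In the barred system both x and X are transformed by s, so the
# gradient becomes s * grad * s^T and the vector becomes s * v.
grad_bar = matmul(matmul(s, grad), transpose(s))
v_bar = matvec(s, v)
G_bar = matmul(transpose(grad_bar), grad_bar)
V_bar = matvec(transpose(grad_bar), v_bar)

# Compare with the tensor and vector transformation laws.
err_G = max_abs_diff(G_bar, matmul(matmul(s, G), transpose(s)))
err_V = max_abs_diff(V_bar, matvec(s, V))
print(err_G < 1e-12, err_V < 1e-12)
```

Plain nested lists are used so that the sketch needs only the standard library.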


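From (3.3), (3.4), and (3.5) we obtain

$$s_{i_1 k_1}s_{i_2 k_2}\cdots s_{i_\mu k_\mu}\,\frac{\partial x_{k_1}}{\partial X_{j_1}}\frac{\partial x_{k_2}}{\partial X_{j_2}}\cdots\frac{\partial x_{k_\mu}}{\partial X_{j_\mu}}\,F_{j_1 j_2\dots j_\mu}\bigl(G_{pq},V_p^{(\alpha)}\bigr)
=\frac{\partial \bar x_{i_1}}{\partial \bar X_{j_1}}\frac{\partial \bar x_{i_2}}{\partial \bar X_{j_2}}\cdots\frac{\partial \bar x_{i_\mu}}{\partial \bar X_{j_\mu}}\,F_{j_1 j_2\dots j_\mu}\bigl(\bar G_{pq},\bar V_p^{(\alpha)}\bigr). \tag{3.8}$$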
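Equation (3.8) expresses the restriction imposed on the functions $F_{j_1 j_2\dots j_\mu}$ in
(3.3) by the requirement of form-invariance under the transformations of the
group $\{S\}$. This restriction may be developed in a more convenient form in the
following manner. Let $v_i^{(\alpha)}$ and $\bar v_i^{(\alpha)}$ ($\alpha=\nu+1,\nu+2,\dots,\nu+\mu$) be the components
of $\mu$ arbitrary vectors in the systems $x$ and $\bar x$ respectively. These components
are related by (3.5). Multiplying (3.8) throughout by $\bar v_{i_1}^{(\nu+1)}\bar v_{i_2}^{(\nu+2)}\cdots\bar v_{i_\mu}^{(\nu+\mu)}$ and
using (3.5), (3.6), and (3.7), we obtain

$$V_{j_1}^{(\nu+1)}V_{j_2}^{(\nu+2)}\cdots V_{j_\mu}^{(\nu+\mu)}\,F_{j_1 j_2\dots j_\mu}\bigl(G_{pq},V_p^{(\alpha)}\bigr)
=\bar V_{j_1}^{(\nu+1)}\bar V_{j_2}^{(\nu+2)}\cdots \bar V_{j_\mu}^{(\nu+\mu)}\,F_{j_1 j_2\dots j_\mu}\bigl(\bar G_{pq},\bar V_p^{(\alpha)}\bigr), \tag{3.9}$$

where $\alpha=1,2,\dots,\nu$. This relation expresses the fact that $F$ is a scalar invariant
of the symmetric second-order tensor $\mathbf G$ and the $\nu+\mu$ vectors $V^{(\alpha)}$ ($\alpha=1,2,\dots,\nu+\mu$),
under the group $\{S\}$ of symmetries of the material.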

We shall assume that in (1.2), $f_{i_1 i_2\dots i_\mu}$ is a polynomial in the arguments
$\partial x_p/\partial X_q$ and $v_p^{(\alpha)}$ ($\alpha=1,2,\dots,\nu$). We have seen that, in this case, $F_{j_1 j_2\dots j_\mu}$ is a
polynomial in $G_{pq}$, $V_p^{(\alpha)}$ ($\alpha=1,2,\dots,\nu$) and $G^{-1/2}$. Since $G^{-1/2}$ is unaltered by
any orthogonal transformation, we may treat it as a scalar. Then, $F$ may be
expressed as a polynomial in $G^{-1/2}$ and the elements of an integrity basis for
invariants of $\mathbf G$ and $V^{(\alpha)}$ ($\alpha=1,2,\dots,\nu+\mu$) under the group of transformations $\{S\}$.

Let $I_1, I_2, \dots, I_N$ be the elements of such a basis which do not depend on
the vectors $V^{(\alpha)}$ ($\alpha=\nu+1,\nu+2,\dots,\nu+\mu$) and let $J_1, J_2, \dots, J_M$ be the invariant
products, multilinear in $V^{(\alpha)}$ ($\alpha=\nu+1,\nu+2,\dots,\nu+\mu$), which can be formed
from those elements of the integrity basis which depend on the vectors $V^{(\alpha)}$
($\alpha=\nu+1,\nu+2,\dots,\nu+\mu$). Since, from (3.9), $F$ is multilinear in the vectors $V^{(\alpha)}$
($\alpha=\nu+1,\nu+2,\dots,\nu+\mu$), it must be expressible in the form

$$F=\sum_{\lambda=1}^{M}J_\lambda\,F^{(\lambda)}\bigl(I_1,I_2,\dots,I_N,G^{-1/2}\bigr), \tag{3.10}$$

where the functions $F^{(\lambda)}$ are polynomials.




From (3.9) and (3.10), we obtain

$$F_{j_1 j_2\dots j_\mu}=\frac{\partial^\mu F}{\partial V_{j_1}^{(\nu+1)}\partial V_{j_2}^{(\nu+2)}\cdots\partial V_{j_\mu}^{(\nu+\mu)}}
=\sum_{\lambda=1}^{M}\frac{\partial^\mu J_\lambda}{\partial V_{j_1}^{(\nu+1)}\partial V_{j_2}^{(\nu+2)}\cdots\partial V_{j_\mu}^{(\nu+\mu)}}\,F^{(\lambda)}\bigl(I_1,I_2,\dots,I_N,G^{-1/2}\bigr). \tag{3.11}$$

This result expresses the restriction imposed on the functions $F_{j_1 j_2\dots j_\mu}$ in (3.3)
by the condition of form-invariance under the group $\{S\}$ of symmetry transformations
for the material. The problem of determining the limitations imposed
by form-invariance has thus been reduced to the problem of determining an
integrity basis for invariants of a second-order symmetric tensor $\mathbf G$ and $\nu+\mu$
vectors $V^{(\alpha)}$ under the group $\{S\}$ of symmetry transformations of the material.


4. The anisotropic tensors 


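Let $x$ and $\bar x$ be two rectangular Cartesian coordinate systems related by a
generic transformation (3.1) of the group $\{S\}$. Let $G_{ij}$ and $V_i^{(\alpha)}$ ($\alpha=1,2,\dots,\mu$)
be the components in the system $x$ of a second-order symmetric tensor $\mathbf G$ and $\mu$
vectors $V^{(\alpha)}$; let $\bar G_{ij}$ and $\bar V_i^{(\alpha)}$ be their components in the system $\bar x$. If $\Phi$ is a
polynomial scalar invariant in the components of $\mathbf G$ and $V^{(\alpha)}$, then a typical
term $P$ in $\Phi$ has the form

$$\begin{aligned}
P={}&\alpha_{i_1 j_1 i_2 j_2\dots i_R j_R\,k_1 k_2\dots k_{P_1}\,l_1 l_2\dots l_{P_2}\,\dots\,n_1 n_2\dots n_{P_\mu}}\\
&\times G_{i_1 j_1}G_{i_2 j_2}\cdots G_{i_R j_R}\\
&\times V_{k_1}^{(1)}V_{k_2}^{(1)}\cdots V_{k_{P_1}}^{(1)}\\
&\times V_{l_1}^{(2)}V_{l_2}^{(2)}\cdots V_{l_{P_2}}^{(2)}\\
&\times\cdots\times V_{n_1}^{(\mu)}V_{n_2}^{(\mu)}\cdots V_{n_{P_\mu}}^{(\mu)}
\end{aligned} \tag{4.1}$$

in the coordinate system $x$, and the form

$$\begin{aligned}
P={}&\alpha_{i_1 j_1 i_2 j_2\dots i_R j_R\,k_1 k_2\dots k_{P_1}\,l_1 l_2\dots l_{P_2}\,\dots\,n_1 n_2\dots n_{P_\mu}}\\
&\times \bar G_{i_1 j_1}\bar G_{i_2 j_2}\cdots \bar G_{i_R j_R}\\
&\times \bar V_{k_1}^{(1)}\bar V_{k_2}^{(1)}\cdots \bar V_{k_{P_1}}^{(1)}\\
&\times \bar V_{l_1}^{(2)}\bar V_{l_2}^{(2)}\cdots \bar V_{l_{P_2}}^{(2)}\\
&\times\cdots\times \bar V_{n_1}^{(\mu)}\bar V_{n_2}^{(\mu)}\cdots \bar V_{n_{P_\mu}}^{(\mu)}
\end{aligned} \tag{4.2}$$

in the coordinate system $\bar x$. In (4.1) and (4.2), the $\alpha$'s are constants independent
of $\mathbf G$ and $V^{(\alpha)}$.

From (4.1) and (4.2) we obtain, with (3.1),

$$\begin{aligned}
\alpha_{i_1 j_1\dots i_R j_R\,k_1\dots k_{P_1}\,l_1\dots l_{P_2}\,\dots\,n_1\dots n_{P_\mu}}
={}&s_{i_1 p_1}s_{j_1 q_1}\cdots s_{i_R p_R}s_{j_R q_R}\;s_{k_1 r_1}\cdots s_{k_{P_1} r_{P_1}}\\
&\times s_{l_1 s_1}\cdots s_{l_{P_2} s_{P_2}}\cdots s_{n_1 t_1}\cdots s_{n_{P_\mu} t_{P_\mu}}\\
&\times \alpha_{p_1 q_1\dots p_R q_R\,r_1\dots r_{P_1}\,s_1\dots s_{P_2}\,\dots\,t_1\dots t_{P_\mu}}.
\end{aligned}$$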
" 2 2 




Thus, if we regard $\alpha_{i_1 j_1\dots n_{P_\mu}}$ as the components in the coordinate system $x$ of
a tensor of order $2R+P_1+P_2+\dots+P_\mu$, its components in the coordinate systems $\bar x$,
related to the system $x$ by the transformations of the group $\{S\}$, are also $\alpha_{i_1 j_1\dots n_{P_\mu}}$.
The $\alpha$'s are then said to be anisotropic tensors* [9] for the group $\{S\}$. It has
been shown that the «’s may be expressed as the sum of a number of terms, 
each of which is an outer product formed from a finite number of basic anisotropic 
tensors for the group {S}, the coefficient of each of these terms being a constant. 
It has further been shown how the basic set of anisotropic tensors may be found 
from a knowledge of an integrity basis for invariants, under transformations 
of the group {S}, of an arbitrary number of vectors. When this basic set of 
anisotropic tensors has been found it can be used to determine an integrity 
basis, not necessarily finite, for invariants under the group {S} of other variables 
than vectors. In the following sections this procedure is employed to determine 
integrity bases, under various groups of transformations, for invariants of a 
symmetric second-order tensor and an arbitrary number of vectors. 
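The defining property of an anisotropic tensor, that its components are unchanged by every transformation of the group, can be illustrated numerically. The sketch below is an illustration only, with arbitrarily chosen (hypothetical) angles: it transforms the Kronecker delta and the alternating tensor by a proper rotation and by a reflection. The delta is unchanged in both cases, while the alternating tensor is unchanged by the rotation and changes sign under the reflection.

```python
import math
from itertools import product

def levi_civita(i, j, k):
    """Alternating symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transform_rank2(s, t):
    """Components s_ip s_jq t_pq."""
    r = range(3)
    return [[sum(s[i][p] * s[j][q] * t[p][q] for p, q in product(r, repeat=2))
             for j in r] for i in r]

def transform_rank3(s, t):
    """Components s_ip s_jq s_ku t_pqu."""
    r = range(3)
    return [[[sum(s[i][p] * s[j][q] * s[k][u] * t[p][q][u]
                  for p, q, u in product(r, repeat=3))
              for k in r] for j in r] for i in r]

def max_abs_diff(a, b):
    if isinstance(a, (int, float)):
        return abs(a - b)
    return max(max_abs_diff(x, y) for x, y in zip(a, b))

r = range(3)
delta = [[1.0 if i == j else 0.0 for j in r] for i in r]
eps = [[[float(levi_civita(i, j, k)) for k in r] for j in r] for i in r]
minus_eps = [[[-eps[i][j][k] for k in r] for j in r] for i in r]

# A proper rotation (about x3, then about x1) and a reflection in the x1-x2 plane.
a, b = 0.7, -1.1
rot3 = [[math.cos(a), math.sin(a), 0.0], [-math.sin(a), math.cos(a), 0.0], [0.0, 0.0, 1.0]]
rot1 = [[1.0, 0.0, 0.0], [0.0, math.cos(b), math.sin(b)], [0.0, -math.sin(b), math.cos(b)]]
rotation = matmul(rot1, rot3)
reflection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]

delta_err_rot = max_abs_diff(transform_rank2(rotation, delta), delta)
delta_err_refl = max_abs_diff(transform_rank2(reflection, delta), delta)
eps_err_rot = max_abs_diff(transform_rank3(rotation, eps), eps)
eps_flip_err = max_abs_diff(transform_rank3(reflection, eps), minus_eps)
print(delta_err_rot, delta_err_refl, eps_err_rot, eps_flip_err)
```

This is consistent with the roles the two tensors play in the full and proper orthogonal groups treated in the following sections.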


5. Integrity basis for the full orthogonal group 


In the case when the transformation group $\{S\}$ is the full orthogonal group,
the basic set of anisotropic tensors consists of the Kronecker delta $\delta_{ij}$. Then,
in (4.1), $\alpha_{i_1 j_1\dots n_{P_\mu}}$ may be expressed as the sum of outer products of Kronecker
deltas with constant coefficients. This can, of course, be done only in the case
when $2R+P_1+\dots+P_\mu$ is even. Consequently, terms for which $2R+P_1+P_2+\dots+P_\mu$
is odd cannot occur in the expression for a polynomial scalar invariant
under the full orthogonal group. In the expression for $\alpha_{i_1 j_1\dots n_{P_\mu}}$ as the sum of
outer products of Kronecker deltas with constant coefficients, a typical term is

$$C\,\delta_{p_1 p_2}\delta_{p_3 p_4}\cdots\delta_{p_{2N-1}p_{2N}}, \tag{5.1}$$

where $C$ is a constant, $2N=2R+P_1+P_2+\dots+P_\mu$, and $p_1, p_2, \dots, p_{2N}$ forms a
permutation of the subscripts on the $G$'s and $V$'s occurring on the right-hand
side of equation (4.1). With this interpretation of $p_1, p_2, \dots, p_{2N}$,

$$\begin{aligned}
&\delta_{p_1 p_2}\delta_{p_3 p_4}\cdots\delta_{p_{2N-1}p_{2N}}\,G_{i_1 j_1}G_{i_2 j_2}\cdots G_{i_R j_R}\\
&\quad\times V_{k_1}^{(1)}V_{k_2}^{(1)}\cdots V_{k_{P_1}}^{(1)}\\
&\quad\times V_{l_1}^{(2)}V_{l_2}^{(2)}\cdots V_{l_{P_2}}^{(2)}\\
&\quad\times\cdots\times V_{n_1}^{(\mu)}V_{n_2}^{(\mu)}\cdots V_{n_{P_\mu}}^{(\mu)}
\end{aligned} \tag{5.2}$$

may be expressed as a product of the invariants

$$\operatorname{tr}\mathbf G^M\quad\text{and}\quad V^{(\alpha)}\mathbf G^N V^{(\beta)}\qquad (M=1,2,\dots;\ N=0,1,2,\dots;\ \alpha,\beta=1,2,\dots,\mu). \tag{5.3}$$

Consequently, any polynomial scalar invariant under the full orthogonal group
of the symmetric second-order tensor $\mathbf G$ and the $\mu$ vectors $V^{(\alpha)}$ ($\alpha=1,2,\dots,\mu$)
may be expressed as a polynomial in the invariants (5.3).


* In the particular case when the group $\{S\}$ is the full or proper orthogonal group
they are isotropic tensors.




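The Hamilton-Cayley theorem for a 3x3 matrix $\mathbf G$ may be expressed in
the form

$$\mathbf G^3-\mathbf G^2\operatorname{tr}\mathbf G+\tfrac12\mathbf G\bigl[(\operatorname{tr}\mathbf G)^2-\operatorname{tr}\mathbf G^2\bigr]-\tfrac16\bigl[2\operatorname{tr}\mathbf G^3-3\operatorname{tr}\mathbf G\operatorname{tr}\mathbf G^2+(\operatorname{tr}\mathbf G)^3\bigr]\mathbf I=0. \tag{5.4}$$

Using this relation we can express $\operatorname{tr}\mathbf G^M$ as a polynomial in $\operatorname{tr}\mathbf G$, $\operatorname{tr}\mathbf G^2$, and $\operatorname{tr}\mathbf G^3$,
and $V^{(\alpha)}\mathbf G^N V^{(\beta)}$ as a polynomial in $V^{(\alpha)}\mathbf G^M V^{(\beta)}$ ($M=0,1,2$) and these traces.
It follows that any polynomial scalar invariant under the full orthogonal group
of the symmetric second-order tensor $\mathbf G$ and the $\mu$ vectors $V^{(\alpha)}$ ($\alpha=1,2,\dots,\mu$)
may be expressed as a polynomial in the invariants

$$\operatorname{tr}\mathbf G^M\ (M=1,2,3)\quad\text{and}\quad V^{(\alpha)}\mathbf G^M V^{(\beta)}\ (M=0,1,2). \tag{5.5}$$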
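As a numerical illustration of how (5.4) reduces higher traces, the sketch below checks the Hamilton-Cayley relation for an arbitrarily chosen (hypothetical) symmetric matrix and then verifies the consequence $\operatorname{tr}\mathbf G^4 = c_1\operatorname{tr}\mathbf G^3 - c_2\operatorname{tr}\mathbf G^2 + c_3\operatorname{tr}\mathbf G$, obtained by multiplying (5.4) by $\mathbf G$ and taking the trace. The names `c1`, `c2`, `c3` for the three coefficients in (5.4) are introduced here for convenience only.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(a):
    return sum(a[i][i] for i in range(3))

# Hypothetical symmetric matrix G.
G = [[2.0, 0.5, -1.0],
     [0.5, 1.0, 0.3],
     [-1.0, 0.3, 3.0]]

G2 = matmul(G, G)
G3 = matmul(G2, G)
G4 = matmul(G3, G)
t1, t2, t3 = trace(G), trace(G2), trace(G3)

# Coefficients appearing in (5.4); c3 equals the determinant of G.
c1 = t1
c2 = 0.5 * (t1 ** 2 - t2)
c3 = (2.0 * t3 - 3.0 * t1 * t2 + t1 ** 3) / 6.0

# Residual of G^3 - c1 G^2 + c2 G - c3 I, which should vanish.
ch_error = max(
    abs(G3[i][j] - c1 * G2[i][j] + c2 * G[i][j] - (c3 if i == j else 0.0))
    for i in range(3) for j in range(3)
)

# tr G^4 expressed through tr G, tr G^2 and tr G^3 only.
tr4_from_lower = c1 * t3 - c2 * t2 + c3 * t1
tr4_error = abs(trace(G4) - tr4_from_lower)
print(ch_error, tr4_error)
```

Repeating the multiplication by $\mathbf G$ reduces every higher trace in the same way.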


6. Integrity basis for the proper orthogonal group 
The group of transformations $\{S\}$ characterizing isotropic materials with no
center of symmetry is the proper orthogonal group, consisting of all orthogonal
transformations with $|s_{ij}|=+1$.
The basic anisotropic tensors of the rotation group are the Kronecker delta
and the third-order alternating tensor. Since

$$\varepsilon_{i_1 i_2 i_3}\varepsilon_{j_1 j_2 j_3}=\begin{vmatrix}\delta_{i_1 j_1}&\delta_{i_1 j_2}&\delta_{i_1 j_3}\\ \delta_{i_2 j_1}&\delta_{i_2 j_2}&\delta_{i_2 j_3}\\ \delta_{i_3 j_1}&\delta_{i_3 j_2}&\delta_{i_3 j_3}\end{vmatrix}, \tag{6.1}$$

no sum of outer products of these tensors need contain $\varepsilon_{ijk}$ to higher degree
than the first. In the expressions for $\alpha_{i_1 j_1\dots n_{P_\mu}}$ in (4.1), a typical term may then
be either

$$C_1\,\delta_{p_1 p_2}\delta_{p_3 p_4}\cdots\delta_{p_{2N-1}p_{2N}} \tag{6.2}$$

or

$$C_2\,\delta_{p_1 p_2}\delta_{p_3 p_4}\cdots\delta_{p_{2M-1}p_{2M}}\,\varepsilon_{p_{2M+1}p_{2M+2}p_{2M+3}}, \tag{6.3}$$

depending on whether $2R+P_1+P_2+\dots+P_\mu=2N$ or $2M+3$, i.e. on whether
the total number of subscripts is even or odd. In either case the subscripts $p$
form a permutation of the subscripts occurring on the right-hand side of (4.1).

Replacing $\alpha_{i_1 j_1\dots n_{P_\mu}}$ in (4.1) by (6.2) and (6.3) in turn, we see that the
polynomial invariant $P$ must be a polynomial in the following invariants:

$$V^{(\alpha)}\mathbf G^N V^{(\beta)},\qquad \operatorname{tr}\mathbf G^N, \tag{6.4}$$

$$\varepsilon_{ijk}(\mathbf G^M)_{ip}V_p^{(\alpha)}(\mathbf G^N)_{jq}V_q^{(\beta)}(\mathbf G^P)_{kr}V_r^{(\gamma)}, \tag{6.5}$$

$$\varepsilon_{ijk}(\mathbf G^M)_{ip}V_p^{(\alpha)}(\mathbf G^N)_{jk}, \tag{6.6}$$

where $M, N, P=0,1,2,\dots$, and $\alpha,\beta,\gamma=1,2,\dots,\mu$. The invariants (6.6) vanish
identically since $\mathbf G$ is symmetric.

Using the Hamilton-Cayley theorem (5.4), each of the invariants (6.4), (6.5),
and (6.6) may be expressed as a polynomial in the elements of the sub-set obtained
by restricting the powers $M$, $N$, and $P$ to be either 0, 1, or 2, independently.


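If neither $M$, $N$, nor $P$ is zero in a given invariant of the type (6.5), we may
use the relation

$$\varepsilon_{ijk}G_{ip}G_{jq}G_{kr}=\varepsilon_{pqr}|G_{ij}|=\tfrac16\,\varepsilon_{pqr}\bigl[2\operatorname{tr}\mathbf G^3-3\operatorname{tr}\mathbf G\operatorname{tr}\mathbf G^2+(\operatorname{tr}\mathbf G)^3\bigr] \tag{6.7}$$

recursively, to express that invariant as a polynomial in the invariants (6.4)
(where $N=0,1,2$) and those of type (6.5) for which $M$, $N$, or $P=0$. For example

$$\varepsilon_{ijk}\bigl(G_{ip}V_p^{(\alpha)}\bigr)\bigl(G_{jq}V_q^{(\beta)}\bigr)\bigl(G_{kr}G_{rs}V_s^{(\gamma)}\bigr)=|G_{ij}|\,\varepsilon_{pqr}V_p^{(\alpha)}V_q^{(\beta)}G_{rs}V_s^{(\gamma)}, \tag{6.8}$$

and $|G_{ij}|$ may be expressed as a polynomial in $\operatorname{tr}\mathbf G$, $\operatorname{tr}\mathbf G^2$ and $\operatorname{tr}\mathbf G^3$ by (6.7).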
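The determinant relation (6.7) and the example (6.8) can be checked by direct summation. The sketch below is an illustration with arbitrarily chosen (hypothetical) numbers; it evaluates both sides of $\varepsilon_{ijk}G_{ip}G_{jq}G_{kr}=\varepsilon_{pqr}|G_{ij}|$ for every $p,q,r$ and both sides of the worked example with three vectors.

```python
from itertools import product

def levi_civita(i, j, k):
    """Alternating symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def triple(a, b, c):
    """eps_ijk a_i b_j c_k."""
    return sum(levi_civita(i, j, k) * a[i] * b[j] * c[k]
               for i, j, k in product(range(3), repeat=3))

# Hypothetical symmetric G and three vectors.
G = [[2.0, 0.5, -1.0], [0.5, 1.0, 0.3], [-1.0, 0.3, 3.0]]
Va, Vb, Vc = [1.0, -2.0, 0.5], [0.3, 0.7, -1.2], [2.0, 0.1, 0.4]
detG = det3(G)

# Relation (6.7), checked for all p, q, r.
err_67 = max(
    abs(sum(levi_civita(i, j, k) * G[i][p] * G[j][q] * G[k][r]
            for i, j, k in product(range(3), repeat=3))
        - levi_civita(p, q, r) * detG)
    for p, q, r in product(range(3), repeat=3)
)

# Example (6.8): powers (M, N, P) = (1, 1, 2) reduce to (0, 0, 1).
GVa, GVb, GVc = matvec(G, Va), matvec(G, Vb), matvec(G, Vc)
G2Vc = matvec(G, GVc)
lhs_68 = triple(GVa, GVb, G2Vc)
rhs_68 = detG * triple(Va, Vb, GVc)
err_68 = abs(lhs_68 - rhs_68)
print(err_67, err_68)
```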


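To reduce the list of basic invariants further, we will establish several reduction
formulas. The following identity is valid if the range of the subscripts
is 1, 2, 3:

$$\begin{vmatrix}\delta_{i_1 j_1}&\delta_{i_1 j_2}&\delta_{i_1 j_3}&\delta_{i_1 j_4}\\ \delta_{i_2 j_1}&\delta_{i_2 j_2}&\delta_{i_2 j_3}&\delta_{i_2 j_4}\\ \delta_{i_3 j_1}&\delta_{i_3 j_2}&\delta_{i_3 j_3}&\delta_{i_3 j_4}\\ \delta_{i_4 j_1}&\delta_{i_4 j_2}&\delta_{i_4 j_3}&\delta_{i_4 j_4}\end{vmatrix}=0. \tag{6.9}$$

Expanding (6.9) by the minors of the first column, and using (6.1), we obtain

$$\bigl(\delta_{i_1 j_1}\varepsilon_{i_2 i_3 i_4}-\delta_{i_2 j_1}\varepsilon_{i_1 i_3 i_4}+\delta_{i_3 j_1}\varepsilon_{i_1 i_2 i_4}-\delta_{i_4 j_1}\varepsilon_{i_1 i_2 i_3}\bigr)\varepsilon_{j_2 j_3 j_4}=0. \tag{6.10}$$

Setting $j_2=1$, $j_3=2$, and $j_4=3$ in (6.10), we have

$$\delta_{i_1 j_1}\varepsilon_{i_2 i_3 i_4}-\delta_{i_2 j_1}\varepsilon_{i_1 i_3 i_4}+\delta_{i_3 j_1}\varepsilon_{i_1 i_2 i_4}-\delta_{i_4 j_1}\varepsilon_{i_1 i_2 i_3}=0. \tag{6.11}$$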
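Because the subscripts range over only three values, an identity of the type (6.11) can be confirmed by exhaustive enumeration. The sketch below assumes the identity in the form $\delta_{i_1 j}\varepsilon_{i_2 i_3 i_4}-\delta_{i_2 j}\varepsilon_{i_1 i_3 i_4}+\delta_{i_3 j}\varepsilon_{i_1 i_2 i_4}-\delta_{i_4 j}\varepsilon_{i_1 i_2 i_3}=0$ (the index labelling is an assumption, since the printed labels are not recoverable from the scan) and counts the index combinations for which it fails.

```python
from itertools import product

def levi_civita(i, j, k):
    """Alternating symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    return 1 if i == j else 0

# Evaluate the four-term expression for all 3**5 index combinations.
violations = 0
checked = 0
for i1, i2, i3, i4, j in product(range(3), repeat=5):
    total = (delta(i1, j) * levi_civita(i2, i3, i4)
             - delta(i2, j) * levi_civita(i1, i3, i4)
             + delta(i3, j) * levi_civita(i1, i2, i4)
             - delta(i4, j) * levi_civita(i1, i2, i3))
    checked += 1
    if total != 0:
        violations += 1

print(checked, violations)
```

The expression is antisymmetric in the four indices $i_1,\dots,i_4$, and four indices drawn from three values must contain a repeat, which is why no violation can occur.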
eee (6.11) throughout by G; 
Ciple MGV (Gi, Van 
RELY V; Ge, VAS (G2), yo) — €;5p Vo Gj5 Ve (G®),.4 yn aoe (6.42) 
— €:74 Gin vee" Giq ae (G2), 4 


Vi G5 VE (G2); g Vie”, we obtain 


Yh “de 


nn Craik 


Using the Hamilton-Cayley theorem (5.4) in the second term on the right-hand 
side of (6.12), and using (6.8) in the third term, we obtain after some simplification 


Dep Vio) (G?), ae (G2), Vv, 
= a (Gy, Gon “AA cy, Gr wi) Ciak A G3» vse) Greg Ve) =F (6.13) 
=p | Gr | a Aw Vv, Grp ye .— Craik Aas y,°) Grp eo) y 
We may thus omit the invariants (6.5) in which M, N, P is some permutation 
of 0, 2, 2, since they are expressible as polynomials in the remaining invariants. 


Multiplying (6.11) throughout by G;,;, vy) Gy Ge V,” we obtain 
Ore V7; 1G, ae (G2), Vi? at erin Vi Vy (G?),; Vo Ge. VV," 
sab os caampaie eet Crve 6! G; PATER SAY! (6.14) 
= (é V,° Lowy Vie Gro Vie Ws Keer Csi V,% ) A V2, 


nn Ciik 
where in the final step we use (6.7). 
Multiplying (6.11) throughout by G;,;, Ag V9) (@2);, ALE we obtain 
Cif V; “G, Vp (B) (G?);., BAD ne Cipk Gio Ag V8 MG, V; y) 


(6.15) 
a Vi Vi) (G2), 5 VI? = e552, iP Yi (G)n.4 VE”. 


mn Ciik 




Using the Hamilton-Cayley theorem (5.4) in the final term, we obtain after 
some simplification 


Vi) G5 Ve? (Bag Vi” + eijn Gig VEO YG (Gg V” 


ge (6:10) 
_HG.,.G», See OPE C54 n Vie 1) Ge Vie — |Gp| enV? Ag Vy 
Multiplying (6.11) ent by 6; 4 VG, Vi”, we obtain 
€ 5» Vie 1G; UAL Naa ka + ea G; BNO (6.17) 
= =G,, n Crk Vio V; Oe avon st - Cigk BP V) (G);» sige 
Multiplying (6.11) throughout a Gj, Vir Yui? V), we obtain 
Csi n Gap Ver VO VP A 0555, VO Gin ViPV 4 2:4, VO G™ G, 5 Ve" 6.18) 
= shige Gh ya vy" V,”. 


For definiteness in applying the formulas (6.14), (6.16), (6.17), and (6.18),
consider that $\alpha\le\beta\le\gamma$ in (6.5). The powers $(M, N, P)$ must be one of the following
sets: (012), (120), (201), (210), (102), (021), (200), (020), (002), (011), (101),
(110), (100), (010), (001), (000). Using (6.14) and (6.16), and relations obtained
from these by cyclic permutation of $\alpha$, $\beta$, $\gamma$, the invariants for which $(MNP)$
is (012), (120), ..., (021) may all be expressed in terms of the one for which
$(MNP)=(012)$ and other invariants. The powers $(MNP)$ in (6.5) may thus
be restricted to the following sets: (012), (200), (020), (002), (011), (101), (110),
(100), (010), (001), (000). Using (6.17), we express (101) in terms of (011), (001),
and (002). Permuting the superscripts $\alpha$, $\beta$, and $\gamma$ cyclically in (6.17), we obtain
a formula expressing (110) in terms of (011), (010), and (020), and a formula
expressing (200) in terms of (100), (110), and (101). The powers $(MNP)$ in (6.5)
may thus be restricted to the following sets: (012), (020), (002), (011), (100), (010),
(001), and (000). Using (6.18) we express (100) in terms of (010), (001), and (000).

We have shown that each of the invariants (6.4) to (6.6) may be expressed
as a polynomial in the following invariants:

$$\begin{gathered}
G_{ii},\quad G_{ij}G_{ji},\quad G_{ij}G_{jk}G_{ki},\\
V_i^{(\alpha)}V_i^{(\beta)},\quad V_i^{(\alpha)}G_{ij}V_j^{(\beta)},\quad V_i^{(\alpha)}G_{ij}G_{jk}V_k^{(\beta)},\\
\varepsilon_{ijk}V_i^{(\alpha)}(\mathbf G^N)_{jp}V_p^{(\beta)}(\mathbf G^P)_{kq}V_q^{(\gamma)}\\
\bigl(\alpha\le\beta\le\gamma;\ (N,P)=(0,0),(0,1),(1,0),(1,1),(0,2),(2,0),(1,2)\bigr).
\end{gathered} \tag{6.19}$$

The restriction on the relative values of the indices $\alpha$, $\beta$, and $\gamma$ is of course
arbitrary; any other ordering may be chosen.
In view of (6.1), a polynomial in the invariants (6.19) which contains products
of the invariants involving $\varepsilon_{ijk}$ can be expressed as a polynomial which does
not involve such products. That is, any polynomial invariant can be expressed
in a form which is of no higher than first degree in the invariants involving $\varepsilon_{ijk}$.


7. Integrity basis for the transversely isotropic transformation group 
Transversely isotropic materials are symmetric with respect to rotation about 
one axis or reflection in any plane containing that axis. If we take the symmetry 
axis to be the x3 axis, the symmetry transformations are the identity and the 




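following with their products:

$$\begin{Vmatrix}\cos\theta&\sin\theta&0\\ -\sin\theta&\cos\theta&0\\ 0&0&1\end{Vmatrix},\qquad \begin{Vmatrix}-1&0&0\\ 0&1&0\\ 0&0&1\end{Vmatrix}. \tag{7.1}$$

The corresponding anisotropic tensors are [9]

$$\delta_{3i}\quad\text{and}\quad \delta_{1i}\delta_{1j}+\delta_{2i}\delta_{2j}=\alpha_{ij}\ \text{(say)}. \tag{7.2}$$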
A typical term of «;,;, By in (4.1) is of the form 
© op, pb. Xs bs Xp pw Opyr3 Opweed +++ Owe» (7.3) 


where the subscripts # form a permutation of the subscripts appearing in (4.1). 
It follows that P in (4.1) is a polynomial in the following invariants 


tr(Ga)*, a(Ga)’b, (IN se 4123 1.3 (7.4) 


where a=||«,;|| and the vectors a and b, whose components are a, and ), re- 
spectively, may take the following forms independently: 


a; = 03; Vi; 0, 

b= Owe : (AE PD Ta (7.5) 
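We note that

$$\begin{aligned}
\operatorname{tr}(\mathbf G\boldsymbol\alpha)&=G_{11}+G_{22}=G_{\alpha\alpha}\ \text{(say)},\\
\operatorname{tr}(\mathbf G\boldsymbol\alpha)^2&=G_{11}^2+G_{22}^2+2G_{12}^2=G_{\alpha\beta}G_{\alpha\beta}\ \text{(say)},\\
\operatorname{tr}(\mathbf G\boldsymbol\alpha)^3&=\tfrac32\,G_{\alpha\alpha}G_{\beta\gamma}G_{\beta\gamma}-\tfrac12\,(G_{\alpha\alpha})^3.
\end{aligned} \tag{7.6}$$

(In the last equation the notation introduced in the first two is used.) Using
the Hamilton-Cayley theorem (5.4) with $\mathbf G$ replaced by $\mathbf G\boldsymbol\alpha$, we obtain, with (7.6),

$$(\mathbf G\boldsymbol\alpha)^3=G_{\alpha\alpha}(\mathbf G\boldsymbol\alpha)^2+\tfrac12\bigl(G_{\alpha\beta}G_{\alpha\beta}-G_{\alpha\alpha}G_{\beta\beta}\bigr)\mathbf G\boldsymbol\alpha. \tag{7.7}$$

Hence each power of the matrix $\mathbf G\boldsymbol\alpha$ except the zero power may be expressed
as a linear combination of the first and second powers of that matrix, with
coefficients which are polynomials in the invariants $G_{\alpha\alpha}$ and $G_{\alpha\beta}G_{\alpha\beta}$. Each of the
invariants (7.4) may thus be expressed as a polynomial in the elements of the subset
obtained by restricting $N$ to the values 0, 1, and 2. These are, omitting redundant
elements,

$$\begin{gathered}
V_3^{(\kappa)},\quad V_\alpha^{(\kappa)}G_{\alpha3},\quad V_\alpha^{(\kappa)}G_{\alpha\beta}G_{\beta3},\quad V_\alpha^{(\kappa)}G_{\alpha\beta}G_{\beta\gamma}G_{\gamma3},\\
V_\alpha^{(\kappa)}V_\alpha^{(\lambda)},\quad V_\alpha^{(\kappa)}G_{\alpha\beta}V_\beta^{(\lambda)},\quad V_\alpha^{(\kappa)}G_{\alpha\beta}G_{\beta\gamma}V_\gamma^{(\lambda)},\\
G_{\alpha\alpha},\quad G_{\alpha\beta}G_{\alpha\beta},\quad G_{33},\quad G_{3\alpha}G_{\alpha3},\quad G_{3\alpha}G_{\alpha\beta}G_{\beta3}.
\end{gathered} \tag{7.8}$$

Here, as in (7.6) and (7.7), repeated Greek subscripts imply summation over
the range 1, 2; the superscripts $\kappa$, $\lambda$ label the vectors.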

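The Hamilton-Cayley theorem for a 2x2 symmetric matrix with elements
$G_{\alpha\beta}$ is

$$G_{\alpha\beta}G_{\beta\gamma}=G_{\alpha\gamma}G_{\beta\beta}+\tfrac12\,\delta_{\alpha\gamma}\bigl(G_{\beta\delta}G_{\beta\delta}-G_{\beta\beta}G_{\delta\delta}\bigr). \tag{7.9}$$

Using this, $V_\alpha^{(\kappa)}(G_{\alpha\beta}G_{\beta\gamma})G_{\gamma3}$ and $V_\alpha^{(\kappa)}(G_{\alpha\beta}G_{\beta\gamma})V_\gamma^{(\lambda)}$ may be expressed as polynomials
in the remaining invariants (7.8), which are

$$\begin{gathered}
V_3^{(\kappa)},\quad V_\alpha^{(\kappa)}G_{\alpha3},\quad V_\alpha^{(\kappa)}G_{\alpha\beta}G_{\beta3},\\
V_\alpha^{(\kappa)}V_\alpha^{(\lambda)},\quad V_\alpha^{(\kappa)}G_{\alpha\beta}V_\beta^{(\lambda)}\qquad(\kappa,\lambda=1,2,\dots,\mu),\\
G_{\alpha\alpha},\quad G_{\alpha\beta}G_{\alpha\beta},\quad G_{33},\quad G_{3\alpha}G_{\alpha3},\quad G_{3\alpha}G_{\alpha\beta}G_{\beta3}.
\end{gathered} \tag{7.10}$$

It is seen by inspection that the invariants (7.10) form an irreducible integrity basis.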
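The invariance of quantities of the kind listed for transverse isotropy can be checked numerically. The following is a minimal sketch, assuming one vector and arbitrarily chosen (hypothetical) numbers. It evaluates $G_{\alpha\alpha}$, $G_{\alpha\beta}G_{\alpha\beta}$, $G_{33}$, $G_{3\alpha}G_{\alpha3}$, $G_{3\alpha}G_{\alpha\beta}G_{\beta3}$, $V_3$, $V_\alpha V_\alpha$, $V_\alpha G_{\alpha3}$, $V_\alpha G_{\alpha\beta}V_\beta$ and $V_\alpha G_{\alpha\beta}G_{\beta3}$ (Greek subscripts summed over 1, 2) before and after a rotation about $x_3$ combined with a reflection in a plane containing $x_3$. A rotation about $x_1$, which is not a symmetry, is used as a control.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def invariants(G, V):
    ab = (0, 1)  # Greek subscripts: the two axes normal to the symmetry axis x3
    return [
        sum(G[a][a] for a in ab),
        sum(G[a][b] * G[a][b] for a in ab for b in ab),
        G[2][2],
        sum(G[2][a] * G[a][2] for a in ab),
        sum(G[2][a] * G[a][b] * G[b][2] for a in ab for b in ab),
        V[2],
        sum(V[a] * V[a] for a in ab),
        sum(V[a] * G[a][2] for a in ab),
        sum(V[a] * G[a][b] * V[b] for a in ab for b in ab),
        sum(V[a] * G[a][b] * G[b][2] for a in ab for b in ab),
    ]

def deviation(s, G, V):
    """Largest change in any listed quantity when G, V are transformed by s."""
    G_bar = matmul(matmul(s, G), transpose(s))
    V_bar = matvec(s, V)
    return max(abs(x - y) for x, y in zip(invariants(G, V), invariants(G_bar, V_bar)))

# Hypothetical symmetric tensor and vector.
G = [[2.0, 0.5, -1.0], [0.5, 1.0, 0.3], [-1.0, 0.3, 3.0]]
V = [1.0, -2.0, 0.5]

t = 0.9
rot3 = [[math.cos(t), math.sin(t), 0.0], [-math.sin(t), math.cos(t), 0.0], [0.0, 0.0, 1.0]]
refl = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
symmetry = matmul(refl, rot3)

u = 0.4
rot1 = [[1.0, 0.0, 0.0], [0.0, math.cos(u), math.sin(u)], [0.0, -math.sin(u), math.cos(u)]]

dev_symmetry = deviation(symmetry, G, V)   # transformation of the group
dev_control = deviation(rot1, G, V)        # not a symmetry transformation
print(dev_symmetry, dev_control)
```

A check of this kind confirms invariance only; it says nothing about completeness or irreducibility of the basis.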




8. Integrity bases for finite groups 

In the case of a finite sub-group of the orthogonal group, the integrity basis
for a symmetric second-order tensor and $\mu$ vectors may be found by a method
similar to that previously employed [5] in the case of a symmetric second-order
tensor and a single vector. However, for uniformity we shall here use the appropriate
anisotropic tensors to derive integrity bases in a manner similar to that
employed in the previous sections. Two finite transformation groups will be
considered: the monoclinic-domatic and rhombic-pyramidal groups. The
integrity bases for the remaining crystal groups can be found in a somewhat
similar manner. Further examples are derived in an earlier report [10] on which
the present paper is based.

(a) Monoclinic-domatic group. Monoclinic-domatic crystals have a single
plane of symmetry. If we take the $x_3$ axis of a rectangular Cartesian reference
system $x$ normal to this plane, the group $\{S\}$ of symmetry transformations
consists of the identity and the transformation with diagonal matrix $(1, 1, -1)$.
The corresponding anisotropic tensors have the basis [9] $\delta_{1i}$, $\delta_{2i}$ and $\delta_{3i}\delta_{3j}=\beta_{ij}$
(say). Substituting outer products of these for $\alpha_{i_1 j_1\dots n_{P_\mu}}$ in (4.1) and tabulating
the possible inner products which can thus arise, we find that polynomials in
$\mathbf G$ and $V^{(\alpha)}$ invariant under the monoclinic-domatic transformation group must
be expressible as polynomials in the following invariants:


ir(G@ B)", SalGeBy G ON (Re. 4 (8.1) 


where $\boldsymbol\beta=\|\beta_{ij}\|$ and the vectors $a$ and $b$ have components $a_i$ and $b_i$ respectively,
which may take the following forms:


a= 61, das, Ene (2 


= B 
b= G5 Gia 9 VX 1, 2, -.-, ft). (8.2) 


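Using the substitution properties of the Kronecker delta, we find that the
$ij$-component of the matrix $(\mathbf G\boldsymbol\beta)^N$ may be expressed in the following way:

$$\begin{aligned}
&(G_{ik_1}\delta_{3k_1}\delta_{3j_1})(G_{j_1k_2}\delta_{3k_2}\delta_{3j_2})\cdots(G_{j_{N-1}k_N}\delta_{3k_N}\delta_{3j})\\
&\quad=(G_{i3}\delta_{3j_1})(G_{j_13}\delta_{3j_2})\cdots(G_{j_{N-1}3}\delta_{3j})\\
&\quad=G_{i3}(\delta_{3j_1}G_{j_13})(\delta_{3j_2}G_{j_23})\cdots(\delta_{3j_{N-1}}G_{j_{N-1}3})\,\delta_{3j}\\
&\quad=(G_{33})^{N-1}G_{i3}\,\delta_{3j}\qquad(N\ge1).
\end{aligned} \tag{8.3}$$

Using (8.3), we express each invariant (8.1) with $N\ge1$ as a power of $G_{33}$ times the
invariant obtained by taking $N=1$. The complete list, omitting obvious redundancies,
is as follows: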


Gy1, Gyo, Gs3, Gi», (Go3)*, 
Vi, Vi, Vio) VP), Vio Gs e A Gay (8.4) 
(oy BA, Os wee 


By inspection, no one of the invariants (8.4) can be expressed as a polynomial
in the others, hence this integrity basis is irreducible.




(b) Rhombic-pyramidal group. Rhombic-pyramidal crystals have two orthogonal
planes of symmetry. If we take the $x_1$ axis parallel to the intersection
of these two planes and the $x_2$ and $x_3$ axes in the planes of symmetry, the
transformation matrices describing the group of symmetries are diagonal matrices
with components $(1, 1, 1)$, $(1, -1, 1)$, $(1, 1, -1)$, and their products. The basic
anisotropic tensors of this group are [9] $\delta_{i1}$, $\delta_{i2}\delta_{j2}=\beta_{ij}^{(2)}$ (say), and $\delta_{i3}\delta_{j3}=\beta_{ij}^{(3)}$
(say).

Substituting outer products of these tensors for « 


ij,..np, M (4.1) and forming 


all the inner products which may thus arise, we obtain ‘ 
te een! De Nj Oke?) (8.5) 
where P™) — || P|], 
N ay iL iL) 
Pe ty Gij, By Gi, Bf? 30h Can Be, (8 6) 
and 
Oa O19 VAs ), V3 se (8 7) 
b= Vio Gi, 


and jt;, fiz, -.., ly take the values 2, 3 independently. 


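The invariants of the type $\operatorname{tr}\mathbf P^{(1)}$ are $G_{22}$ and $G_{33}$. The invariants of the
type $\operatorname{tr}\mathbf P^{(2)}$ are $G_{22}^2$, $G_{33}^2$ and $G_{23}^2$. From the substitution properties of the tensors
$\beta_{ij}^{(2)}$ and $\beta_{ij}^{(3)}$, the product $P_{ij}^{(N)}$ ($N\ge1$) may be expressed as the product of $G_{iK}\delta_{Lj}$
($K, L=2, 3$) and powers of $G_{22}$, $G_{33}$, and $G_{23}$. In $G_{iK}\delta_{Lj}$, $K\ne L$ only if $P_{ij}^{(N)}$ involves
$G_{23}$ to an odd power. Factoring out the invariants $G_{22}$, $G_{33}$, and $G_{23}^2$, we find
that the remaining factor of $P_{ij}^{(N)}$ ($N\ge1$) must be one of the following:

$$G_{i2}\delta_{2j},\quad G_{i3}\delta_{3j},\quad G_{i2}G_{23}\delta_{3j},\quad G_{i3}G_{32}\delta_{2j}. \tag{8.8}$$

Substituting these and $\delta_{ij}$ for $P_{ij}^{(N)}$ in (8.5), and substituting the vectors (8.7)
there also, the possible combinations which can be formed are, omitting redundant
elements, the following:

$$\begin{gathered}
G_{11},\ G_{22},\ G_{33},\ G_{23}^2,\ G_{13}^2,\ G_{12}^2,\ G_{12}G_{23}G_{31},\\
V_1^{(\alpha)},\ V_2^{(\alpha)}V_2^{(\beta)},\ V_3^{(\alpha)}V_3^{(\beta)},\ V_2^{(\alpha)}G_{23}V_3^{(\beta)},\\
V_2^{(\alpha)}G_{21},\ V_3^{(\alpha)}G_{31},\ V_2^{(\alpha)}G_{23}G_{31},\ V_3^{(\alpha)}G_{32}G_{21}.
\end{gathered} \tag{8.9}$$

Since each of the invariants (8.5) is a product of these, the invariants (8.9) form
an integrity basis. By inspection, it is seen that this integrity basis is irreducible.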
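For a finite group the invariance of a candidate list can be checked against every group element. The sketch below is an illustration with hypothetical numbers and a single vector; it applies the four diagonal transformations $(1,1,1)$, $(1,-1,1)$, $(1,1,-1)$ and $(1,-1,-1)$ to $\mathbf G$ and $V$ and confirms that the quantities $G_{11}$, $G_{22}$, $G_{33}$, $G_{23}^2$, $G_{13}^2$, $G_{12}^2$, $G_{12}G_{23}G_{31}$, $V_1$, $V_2^2$, $V_3^2$, $V_2G_{23}V_3$, $V_2G_{21}$, $V_3G_{31}$, $V_2G_{23}G_{31}$ and $V_3G_{32}G_{21}$ are unchanged. The component $G_{12}$ on its own serves as a control that is not invariant.

```python
def candidate_invariants(G, V):
    """Quantities expected to be invariant under the rhombic-pyramidal group."""
    return [
        G[0][0], G[1][1], G[2][2],
        G[1][2] ** 2, G[0][2] ** 2, G[0][1] ** 2,
        G[0][1] * G[1][2] * G[2][0],
        V[0], V[1] * V[1], V[2] * V[2], V[1] * G[1][2] * V[2],
        V[1] * G[1][0], V[2] * G[2][0],
        V[1] * G[1][2] * G[2][0], V[2] * G[2][1] * G[1][0],
    ]

def transform(signs, G, V):
    """Apply the diagonal orthogonal matrix diag(signs) to G and V."""
    G_bar = [[signs[i] * signs[j] * G[i][j] for j in range(3)] for i in range(3)]
    V_bar = [signs[i] * V[i] for i in range(3)]
    return G_bar, V_bar

# The identity, the two reflections, and their product.
group = [(1, 1, 1), (1, -1, 1), (1, 1, -1), (1, -1, -1)]

# Hypothetical symmetric tensor and vector.
G = [[2.0, 0.5, -1.0], [0.5, 1.0, 0.3], [-1.0, 0.3, 3.0]]
V = [1.0, -2.0, 0.5]

reference = candidate_invariants(G, V)
max_deviation = 0.0
for signs in group:
    G_bar, V_bar = transform(signs, G, V)
    for x, y in zip(reference, candidate_invariants(G_bar, V_bar)):
        max_deviation = max(max_deviation, abs(x - y))

# Control: G_12 alone changes sign under the reflection (1, -1, 1).
G_refl, _ = transform((1, -1, 1), G, V)
control_change = abs(G_refl[0][1] - G[0][1])
print(max_deviation, control_change)
```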


9. The constitutive equations for the symmetry groups considered 

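We have seen in §3 that if, for a material possessing symmetry, the constitutive
equation is assumed to have the form (1.2), in which $f_{i_1 i_2\dots i_\mu}$ are polynomials
in the arguments, then it must be expressible in the form

$$u_{i_1 i_2\dots i_\mu}=\frac{\partial x_{i_1}}{\partial X_{j_1}}\frac{\partial x_{i_2}}{\partial X_{j_2}}\cdots\frac{\partial x_{i_\mu}}{\partial X_{j_\mu}}\sum_{\lambda=1}^{M}\frac{\partial^\mu J_\lambda}{\partial V_{j_1}^{(\nu+1)}\partial V_{j_2}^{(\nu+2)}\cdots\partial V_{j_\mu}^{(\nu+\mu)}}\,F^{(\lambda)}\bigl(I_1,I_2,\dots,I_N,G^{-1/2}\bigr). \tag{9.1}$$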




$I_1, I_2, \dots, I_N$ are the elements of an integrity basis for the second-order symmetric
tensor $G_{pq}$ and the vectors $V_p^{(\alpha)}$ ($\alpha=1,2,\dots,\nu$) under the group of transformations
characterising the symmetry of the material. $J_\lambda$ ($\lambda=1,2,\dots,M$) are the products,
multilinear in the vectors $V_p^{(\alpha)}$ ($\alpha=\nu+1,\nu+2,\dots,\nu+\mu$), which can be formed
from the elements of an integrity basis for the second-order tensor $G_{pq}$ and the
vectors $V_p^{(\alpha)}$ ($\alpha=1,2,\dots,\nu+\mu$) which involve the vectors $V_p^{(\alpha)}$ ($\alpha=\nu+1,\nu+2,\dots,\nu+\mu$).
If $\mu=1$, the constitutive equation (9.1) takes the form

$$u_i=\frac{\partial x_i}{\partial X_j}\sum_{\lambda=1}^{M}\frac{\partial J_\lambda}{\partial V_j^{(\nu+1)}}\,F^{(\lambda)}\bigl(I_1,I_2,\dots,I_N,G^{-1/2}\bigr), \tag{9.2}$$

where $J_\lambda$ ($\lambda=1,2,\dots,M$) are now the elements, linear in the vector $V_p^{(\nu+1)}$,
of an integrity basis for invariants of the tensor $G_{pq}$ and the $\nu+1$ vectors $V_p^{(\alpha)}$
($\alpha=1,2,\dots,\nu+1$).
If the material considered is isotropic and possesses a center of symmetry,
we see from (5.5) that equation (9.2) takes the form

$$u_i=\frac{\partial x_i}{\partial X_p}\sum_{\alpha=1}^{\nu}\Bigl(F_0^{(\alpha)}V_p^{(\alpha)}+F_1^{(\alpha)}G_{pq}V_q^{(\alpha)}+F_2^{(\alpha)}G_{pq}G_{qr}V_r^{(\alpha)}\Bigr), \tag{9.3}$$

where $F_0^{(\alpha)}$, $F_1^{(\alpha)}$ and $F_2^{(\alpha)}$ are polynomials in $G^{-1/2}$, $\operatorname{tr}\mathbf G^M$ ($M=1,2,3$) and
$V^{(\alpha)}\mathbf G^M V^{(\beta)}$ ($M=0,1,2$; $\alpha,\beta=1,2,\dots,\nu$).

The results given in §§6 to 8 may be used in an analogous fashion to obtain
the corresponding constitutive equations for isotropic materials which do not
possess a center of symmetry, for materials possessing transverse isotropy and
for materials possessing monoclinic-domatic or rhombic-pyramidal symmetry.
Details of these cases and further results for other symmetries and for the cases
when $\mu=2$, have been given in [10].


Acknowledgment. The results presented in this paper were obtained in the course 
of research sponsored by the Office of Ordnance Research, U.S.Army, under Contract 
No. DA-19-020-ORD-4531. 


References 


[1] Voigt, W.: Lehrbuch der Kristallphysik. Leipzig 1910.

[2] Nye, J. F.: Physical Properties of Crystals. Oxford 1957.

[3] Smith, G. F., & R. S. Rivlin: Trans. Amer. Math. Soc. 88, 175 (1958).

[4] Toupin, R.: J. Rat. Mech. & Anal. 5, 849 (1956).

[5] Smith, G. F., & R. S. Rivlin: Brown Univ. Reports Nos. DA-3487/3 (1955),
DA-3487/4 (1956), DA-3487/7 (1956), DA-3487/8 (1956).

[6] Ericksen, J. L., & R. S. Rivlin: J. Rat. Mech. & Anal. 3, 281 (1954).

[7] Smith, G. F., & R. S. Rivlin: Arch. Rat. Mech. & Anal. 1, 107 (1957).

[8] Weyl, H.: The Classical Groups. Princeton 1946.

[9] Smith, G. F., & R. S. Rivlin: Quart. Appl. Math. 15, 308 (1957).

[10] Pipkin, A. C., & R. S. Rivlin: Brown Univ. Report No. DA-4531/4 (1958).


Brown University 
Providence, Rhode Island 


(Received June 29, 1959) 


Derivation of Plate Theory
from Three-Dimensional Elasticity Theory*


DIETRICH MORGENSTERN


Communicated by ELI STERNBERG


I. Statement of the Problem


Three-dimensional linear elasticity theory is a self-contained, mathematically
satisfactory theory with existence and uniqueness theorems for the relevant
boundary value problems, and it agrees adequately with observation for most
structures used in engineering. Alongside it, engineering mechanics uses special
theories for the frequently occurring structural elements beam and plate. The
theory that goes back to BERNOULLI for the beam was carried over to the plate
by KIRCHHOFF¹. Because the three-dimensional systems of equations are difficult
to solve numerically, both are extremely useful theories. They are derived from
the three-dimensional theory by means of certain plausible-seeming additional
hypotheses², which, since the three-dimensional theory has a unique solution,
can only be either false or superfluous. The aim of this investigation is to show
how the equations of Kirchhoff's theory follow from the three-dimensional theory
alone in the limit of vanishing plate thickness. The three-dimensional theory is
applied to a cylindrical region of height $h$ ($|x_3|\le h/2$), where the faces
$x_3=\pm h/2$ are to be stress-free, while suitable boundary conditions are
formulated for the lateral surface of the cylinder; the load is taken to be a body
force in the direction of the $x_3$-axis depending only on $x_1$, $x_2$. The resulting
convergence in the mean shows that the "Bernoulli hypothesis" otherwise used
holds more and more closely as the plate thickness decreases. In comparison with
earlier considerations of similar aim ("theories of thick plates")⁴ it must be noted
that, from the mathematical standpoint, these have only the value of plausibility,
since at some point "St. Venant's principle" is used in a manner that has not yet
been proved.


* This work was carried out at the Technische Universität Berlin.

1 KIRCHHOFF, G.: Crelle J. f. Math. 40, 54 (1850) = Ges. Abhdl. p. 237 (Lpz. 1862).

2 For example SZABÓ, I.: Höhere technische Mechanik, 2. Aufl., p. 167-174.
Berlin-Göttingen-Heidelberg: Springer 1958.

3 A short report on the contents was given by the author at the GAMM meeting on
22. 5. 1959 in Hannover; cf. Z. angew. Math. u. Mech. 39 (1959).

4 See e.g. GECKELER, J. W.: Elastostatik, Kap. VII. Die Biegung der dicken
Platte (Handbuch der Physik, Bd. VI, Mechanik der elastischen Körper, p. 210-230.
Berlin: Springer 1928).

Arch. Rational Mech. Anal., Vol. 4 10 




II. Description of the Method
The method of proof is based on the minimum theorems of elasticity
theory. In contrast to an earlier investigation establishing two-dimensional
elasticity theory⁵, both principles are used here: the principle of minimum
strain energy and the principle of minimum stress energy (also called the
principle of minimum complementary energy, or CASTIGLIANO's principle).
The first principle has the form


(2.1) $A\{u\}=\tfrac12 Q(u,u)-l(u)=\text{Minimum!},$


where $Q$ is a positive-definite quadratic form and $l$ is a linear form, while
the admissible $u$ must satisfy certain linear conditions: "geometrically
admissible $u$". The associated "Euler equations" for the solution $u_0$ of the
problem read


(2.2) $Q(u,u_0)-l(u)=0$

for all $u$ that satisfy the homogeneous conditions. They yield the relation

(2.3) $A\{u\}-A\{u_0\}=\tfrac12 Q(u-u_0,\,u-u_0),$


so that "$A$ close to its minimum value" implies that $u$ is close to the solution
$u_0$ in the $Q$-metric. For the estimate of $A\{u\}-A\{u_0\}$ required here, the second
minimum principle is used, which has the same structure:


(2.4) $B\{\sigma\}=\tfrac12 P(\sigma,\sigma)-k(\sigma)=\text{Minimum!},$


where the stress fields $\sigma$ inserted must satisfy certain linear conditions:
"statically admissible stress fields". For the minimum values we have


(2.5) $\operatorname{Min}A=-\operatorname{Min}B$

and therefore

(2.6) $\tfrac12 Q(u-u_0,\,u-u_0)+\tfrac12 P(\sigma-\sigma_0,\,\sigma-\sigma_0)=A\{u\}+B\{\sigma\},$


so that from any pair of admissible $u$ and $\sigma$ that give a small sum on the right
one can infer the approximation both of $u_0$ by $u$ and of $\sigma_0$ by $\sigma$. The right-hand
side of (2.6) is estimated by comparing both summands with the corresponding
values of the extremum principles for plate theory.
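The relations (2.3), (2.5) and (2.6) can be illustrated with a small finite-dimensional model. The sketch below is a toy analogue under stated assumptions and is not the elasticity problem of the paper: one displacement `u`, two "strains" `d[i] * u`, stiffnesses `c[i]`, a load `load`, and stresses called statically admissible when `d[0]*sigma[0] + d[1]*sigma[1] == load`. All names and numbers are hypothetical.

```python
# Toy model: Q(u, u) = sum_i c_i (d_i u)^2, l(u) = load * u,
# P(sigma, sigma) = sum_i sigma_i^2 / c_i, k = 0.
d = [1.0, 2.0]     # "strain" operator: strain_i = d[i] * u
c = [3.0, 5.0]     # positive stiffnesses
load = 4.0

def A(u):
    """Strain-energy functional, the analogue of (2.1)."""
    return 0.5 * sum(c[i] * (d[i] * u) ** 2 for i in range(2)) - load * u

def B(sigma):
    """Stress-energy functional, the analogue of (2.4), with k = 0."""
    return 0.5 * sum(sigma[i] ** 2 / c[i] for i in range(2))

def is_statically_admissible(sigma):
    return abs(sum(d[i] * sigma[i] for i in range(2)) - load) < 1e-12

q = sum(c[i] * d[i] ** 2 for i in range(2))      # Q(u, u) = q * u**2
u0 = load / q                                    # minimizer of A
sigma0 = [c[i] * d[i] * u0 for i in range(2)]    # minimizer of B among admissible sigma

# An arbitrary displacement and an arbitrary admissible stress pair.
u = 0.3
sigma = [2.0, 1.0]

# Analogue of (2.3).
lhs_23 = A(u) - A(u0)
rhs_23 = 0.5 * q * (u - u0) ** 2

# Analogue of (2.5): Min A + Min B = 0.
min_sum = A(u0) + B(sigma0)

# Analogue of (2.6).
lhs_26 = (0.5 * q * (u - u0) ** 2
          + 0.5 * sum((sigma[i] - sigma0[i]) ** 2 / c[i] for i in range(2)))
rhs_26 = A(u) + B(sigma)

print(abs(lhs_23 - rhs_23), abs(min_sum), abs(lhs_26 - rhs_26))
```

In this model the sum `A(u) + B(sigma)` is computable without knowing `u0` or `sigma0`, which is the feature that makes an identity of the type (2.6) useful as an error bound.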


III. The Extremum Principles
For three-dimensional elasticity theory the following extremum principles
are known to hold:
Principle of minimum strain energy. For $u=u_i$ that satisfy whatever boundary
conditions $u_i=u_i^*$ are prescribed, and the associated strains⁶

$\varepsilon_{ik}=\tfrac12\,(u_{i|k}+u_{k|i}),$

the quantity ($h^3$ is a normalization factor introduced for later use)⁷

(3.1) $h^3A\{u\}=G\iiint\Bigl[\sum_{i,k}\varepsilon_{ik}^2+\frac{1}{m-2}\Bigl(\sum_i\varepsilon_{ii}\Bigr)^2\Bigr]dx_1\,dx_2\,dx_3-\iiint\sum_i u_i g_i\,dx_1\,dx_2\,dx_3-\iint\sum_i u_i f_i\,d\omega$

is smallest for the solution. Here the surface integral extends over that
part of the surface on which the components $f_i$ of the boundary force density
are prescribed; $g_i$ are the given body forces.


Principle of minimum stress energy. For statically admissible $\sigma$ fields,
i.e. those that satisfy the conditions

(3.2) $\sum_k\sigma_{ik|k}=-g_i,$

$\sum_k\sigma_{ik}n_k=f_i$ on that part of the surface⁸ where $f_i$ is given,

the quantity

(3.3) $h^3B\{\sigma\}=\frac{1}{4G}\iiint\Bigl[\sum_{i,k}\sigma_{ik}^2-\frac{1}{m+1}\Bigl(\sum_i\sigma_{ii}\Bigr)^2\Bigr]dx_1\,dx_2\,dx_3-\iint\sum_{i,k}u_i^*\,\sigma_{ik}n_k\,d\omega$

is smallest for the solution. Here the surface integral extends over that
part of the surface on which $u_i$ is prescribed. In addition,

(3.4) $\operatorname{Min}A=-\operatorname{Min}B.$


5 MORGENSTERN, D.: Mathematische Begründung der Scheibentheorie (zweidimensionale
Elastizitätstheorie). Arch. Rational Mech. Anal. 3, 91-96 (1959).

6 Indices placed after a vertical bar denote partial derivatives with respect to
the correspondingly numbered variables.

Fiir die Plattentheorie werden im allgemeinen die beiden analogen Minimalsatze 
nicht formuliert. Damit keine Grenzpunkte zwischen den verschiedenen Teilen 
des Randes auftreten, auf denen verschiedenartige Randbedingungen gestellt 
werden, wird angenommen, daB eine der folgenden Randbedingungen auf dem 
ganzen Rande gilt. Mit der Plattensteifigkeit 

1 B 

12 1297 


gilt dann: 


Principle of minimum strain energy. For geometrically admissible $w$, i.e. those satisfying the conditions

(3.5)  (a) $w = w^{*}$ and $w_{|n}$ prescribed on the boundary,
       (b) $w = w^{*}$ prescribed on the boundary,
       (c) no conditions on $w$,

the functional

$$h^3 A_p\{w\} = \frac{N}{2} \iint \big[ (\Delta w)^2 - 2(1-\nu)\,(w_{|11} w_{|22} - w_{|12}^2) \big]\, dx_1\, dx_2 + \iint p\, w \, dx_1\, dx_2 - \int k(s)\, w(s)\, ds + \int m(s)\, w_{|n}(s)\, ds \tag{3.6}$$


7 Between the elastic constants $G$ (shear modulus), $E$ (modulus of elasticity) and $m$ (Poisson's number) $= 1/\nu$ the relation $2(m+1)\,G = mE$ holds.

8 $n_i$ and $t_i$ denote the components of the unit normal and the unit tangent vector, respectively.


448 DiETRICH MORGENSTERN: 


is smallest for the solution; $w$ is then the deflection of plate theory, which satisfies the equation $\Delta\Delta w = -p/N$; $p$ is the given plate load.


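Principle of minimum stress energy. For statically admissible $M_{ik}$ and $Q_i$ ($i = 1, 2$), i.e. those satisfying the conditions

$$\sum_k M_{ik|k} = Q_i, \qquad \sum_i Q_{i|i} = p \tag{3.7}$$

as well as the boundary conditions

$$\sum_{i,k} M_{ik}\, n_i\, n_k = m(s) \quad \text{(given boundary moment) in cases (b) and (c)},$$
$$\sum_i Q_i\, n_i + \frac{d}{ds}\Big(\sum_{i,k} M_{ik}\, n_i\, t_k\Big) = k(s) \quad \text{(given boundary force)}^{8} \text{ in case (c)}, \tag{3.8}$$

the functional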


2 2 \2 
3 = AL janes) Mi 
8 By(M) =— {fle Lp as 4 i) [4x d x2 
(3.9) a 
+ [ oD Ma nim ds =) wa(> 0; n; +4 | M,,m;t,)) ds 
(a) ge (a, 6) 

is smallest for the solution, for which

$$-\frac{1}{N}\, M_{ik} = (1-\nu)\, w_{|ik} + \nu\, \delta_{ik}\, \Delta w \tag{3.10}$$

holds. Here, analogously,

$$\operatorname{Min} A_p = -\operatorname{Min} B_p. \tag{3.11}$$


In what follows all given boundary values are set equal to 0 for simplicity.

For reasons of normalization we set $p = h^3 p_1(x_1, x_2)$ here. The substitution $N = h^3 N_1$ then reveals the essential independence of $A_p$ and $B_p$ from $h$, if $M_{ik}$ and $Q_i$ are also thought of as carrying corresponding normalization factors. As described in II, the proof now consists essentially in exhibiting suitable admissible comparison systems for the three-dimensional minimum principles, for which


X>5 


(3.42) Se NO —— o 
and, on the lateral surface of the cylinder,

(3.13)  (a) $u_1 = u_2 = u_3 = 0$; no condition on $\sigma_{ik}$,
        (b) $u_3 = 0$; $\sum_k \sigma_{ik}\, n_k = 0$ ($i = 1, 2$),
        (c) no condition on $u_i$, and $\sum_k \sigma_{ik}\, n_k = 0$ ($i = 1, 2, 3$)

are to be required.


IV. Construction of Geometrically Admissible Comparison Functions
Here the natural ansatz

$$u_i = -x_3\, w_{|i} \quad (i = 1, 2), \qquad u_3 = w + x_3^2\, W(x_1, x_2) \tag{4.1}$$




with the associated strain components

$$\varepsilon_{ik} = -x_3\, w_{|ik} \quad (i, k = 1, 2), \qquad \varepsilon_{i3} = \tfrac{1}{2}\, x_3^2\, W_{|i}, \qquad \varepsilon_{33} = 2\, x_3\, W \tag{4.2}$$

is made, where $w$ is the solution of the plate equation. Substituting this "Ritz ansatz", with the function $W$ still arbitrary, into the expression (3.1) for $A$ gives


(gic che ages 
A = 45 | Det -4we+—* (Aw + WY) dadny+ 

+ [[epodndnt = ff (grad W)2 dx, d xp. 
Wegen der Identitat fs 


2 1 a re m—1 Zo Ws 
Wate eg at WE | 
bleibt 


(4.3) 


abe seat (4)? — 2 (11 Waa — Wha) + 


eye m—1 Aw \2 | 
(4.5) + (dwt _4 mat (2w + 32) |axdx, a 
h2 
+ ff epodmdn+ i ff (grad W)2.d x, d x, 
und, da (EEE ist, 
m—1 6 

we h? m—1 Aw \2 

(4.6) A=4,+ 5 ff (grad W)?dxydx, + 7% [f (2w + aS) dx, dX». 


The further determination of $W$ is made in case (c) by

$$W = -\frac{1}{2}\,\frac{\Delta w}{m-1} \tag{4.7}$$

and shows that

$$\lim_{h \to 0} A = A_p. \tag{4.8}$$


In case (b) or (a), satisfying the admissibility conditions (3.13) for the $u_i$ requires that $W = 0$ on the boundary. In this case it is evidently possible, by a continuously differentiable modification of the function $W$ in a small boundary strip, to make the last term in (4.6) smaller than a prescribed positive $\delta$, so that again

$$\lim_{h \to 0} A \le A_p + \delta, \tag{4.9}$$

and hence, $\delta$ being arbitrary,

$$\lim_{h \to 0} A \le A_p$$

is established⁹.


9 Eine genauere Uberlegung laBt sogar 


lim ee <o 


erkennen, entsprechend fiir B. 




V. Construction of Statically Admissible Comparison Systems
In case (a) the ansatz


(5.1) Oj, = — (6% — 3) Q; (4 =1,2) 
O33 = (2%3 — 3 Xgh*) p 
with $M_{ik}$, $Q_i$ solutions of the plate equation, is possible, since apart from equations (3.2) and the conditions of vanishing stress on the faces $x_3 = \pm h/2$ no conditions have to be satisfied. Substituting into the expression (3.3) gives


(5.2) B=B,+G?+C,h, 
from which

$$\lim_{h \to 0} B = B_p \tag{5.3}$$

follows at once⁹.


Specifying admissible comparison systems for the boundary conditions (b) or (c) requires special effort¹⁰; starting from the solutions $M_{ik}$ of the plate equation, functions $N_{ik}$ are to be added in the ansatz (5.1) in such a way that all boundary conditions are then satisfied. This requires

$$\sum_k N_{ik|k} = R_i, \qquad \sum_i R_{i|i} = 0 \tag{5.4}$$

and in case (b)

$$\sum_k N_{ik}\, n_k = -\sum_k M_{ik}\, n_k = a_i, \tag{5.5}$$

in case (c) in addition

$$\sum_i R_i\, n_i = -\sum_i Q_i\, n_i = b. \tag{5.6}$$

Here, in case (b), the boundary conditions

$$\sum_{i,k} M_{ik}\, n_i\, n_k = -\sum_i a_i\, n_i = 0 \tag{5.7}$$

hold, so that one can set

$$a_i = t_i\, a(s), \tag{5.8}$$

while in case (c) we have additionally

$$\sum_i Q_i\, n_i + \frac{d}{ds}\Big(\sum_{i,k} M_{ik}\, n_i\, t_k\Big) = 0, \tag{5.9}$$

and therefore

$$b = \frac{d}{ds} \sum_{i,k} M_{ik}\, n_i\, t_k = -\frac{da}{ds}. \tag{5.10}$$



The equations (5.4) are satisfied by the ansatz

$$N_{11} = \psi_{|2}, \qquad N_{12} = -\tfrac{1}{2}\,(\psi_{|1} + \varphi_{|2}), \qquad N_{22} = \varphi_{|1} \tag{5.11}$$


10 This difficulty does not arise in the corresponding problem of the board-shaped (infinitely wide) beam, i.e. the passage from a two-dimensional elasticity theory to beam theory, since there the $M$ corresponding to the $M_{ik}$ occurring here is uniquely determined by the differential equation up to finitely many constants (statically determinate case).




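(see footnote 11). It will turn out that one can prescribe $\varphi = \psi = 0$ on the boundary. First the normal derivatives $\varphi_n$ and $\psi_n$ are determined. Since the tangential derivatives vanish, $\varphi_s = \psi_s = 0$, there remains on the boundary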


5 Nit = On 
(5.12) Ny 2 = — F (Pn M1 + Yn Me) 
Np» = Yn be 


and the equations to be satisfied take the form


ZN i a He =F (Pn — Yo Ma) Mh = — Oty 


(5.13) 


NA > (Pn M1 — Yn M2) My = 4M, 
i.e. the two equations are equivalent to the single equation


(5.14) Pn M1 — Wn Mg = — a(S). 


The equation (5.6), which arises additionally in case (c), is again equivalent to this equation. For with the assumptions made, a short intermediate calculation gives


c Ae 1 4 
(5.15) I= As (; Pe) = > ae (Pn — Yn Ma) - 


If one now sets, along the inner normal (coordinate $\zeta$ measured from the boundary),


(5.16) O(ss) = pn(s) &(1 — =) fur ¢= 0; sonst 7 =0 


and analogously for $\psi$, one obtains a system of functions $N_{ik}$ which satisfies all boundary conditions but which, used as an additive contribution to $M_{ik}$, $Q_i$ in (5.1), increases the expression $B$ only by an arbitrarily small amount. This proves the convergence (5.3) in these cases as well.


VI. The Convergence Statement
By the general considerations set out earlier in § II we then have


(6.1) lim J aff ec, — Cd x, d x» d x =0, 


where $\varepsilon_{ik}^{(h)}$ describes the true solution of the three-dimensional problem for the cylinder of small height $h$, and $\varepsilon_{ik}$ denotes the approximate expression (4.2) assumed above. Likewise


(6.2) lim J, aff [o,, — td xd xd x5 = 0, 


where $\sigma_{ik}^{(h)}$ denotes the true stresses of the three-dimensional problem and $\sigma_{ik}$ our approximate expressions (5.1). The "Bernoulli hypothesis" for plate theory is thereby proved in the limit of decreasing plate thickness.


11 For a simply connected base domain this is the general solution of eqs. (5.4).




Through calculations that are simple in principle but laborious, a similar convergence in the mean of the associated displacements can also be obtained from the convergence of the strains $\varepsilon_{ik}^{(h)}$. The corresponding elegant considerations of Mr. ANDRÉ¹² can likewise be carried over to the present purpose without difficulty, and they show the convergence of the mean values of $u_3$ to the function $w$.

A problem that remains unsolved for the time being is to establish pointwise convergence as well.

Another question arising here would be the extension of these investigations to vibration problems.


12 ANDRÉ, K.: Mathematische Begründung der Theorie des Balkens. Dissertation. Fak. f. allg. Ing.-Wiss. Techn. Universität Berlin 1959 (submitted June 23, 1959).


Institut für Math. Statistik
Münster in Westfalen


(Received July 18, 1959)


On the Convergence 
of the Rayleigh Quotient Iteration for the Computation 
of Characteristic Roots and Vectors. VI 


(Usual Rayleigh Quotient for Nonlinear Elementary Divisors ) 


A. M. OSTROWSKI 


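73. In what follows we shall apply the classical Rayleigh Quotient Iteration, as described in Part I by the formulae (4) and (5), to a non-Hermitian matrix $A$ in the case of a nonlinear elementary divisor. The procedure amounts to forming successively, starting with $\lambda_0$ and using a constant column vector $\eta$, the sequence $\lambda_\nu$ defined by

$$\xi_\nu = (A - \lambda_\nu I)^{-1}\, \eta, \tag{5}$$

$$\lambda_{\nu+1} = \frac{\xi_\nu^{*} A\, \xi_\nu}{\xi_\nu^{*}\, \xi_\nu}. \tag{4}$$

Again this amounts to $\lambda_{\nu+1} = \varphi(\lambda_\nu)$,

$$\varphi(\lambda) = \frac{\eta^{*} (A^{*} - \bar\lambda I)^{-1} A\, (A - \lambda I)^{-1} \eta}{\eta^{*} (A^{*} - \bar\lambda I)^{-1} (A - \lambda I)^{-1} \eta}. \tag{270}$$

Let $\sigma$ be an eigenvalue of $A$ to which correspond non-linear elementary divisors, and let $L$ be the maximum exponent of an elementary divisor corresponding to $\sigma$. Then we shall prove that unless $\eta$ in (5) satisfies a certain algebraic relation,

$$\varphi(\lambda) - \lambda = (\lambda - \sigma)^{L}\, \big(\gamma + O(\lambda - \sigma)\big) \qquad (\lambda \to \sigma) \tag{271}$$

with $\gamma \ne 0$, $\gamma \ne \infty$. Hence it follows that this procedure, if used directly, converges, if at all, only very slowly. Indeed, it can be shown that in this case, if, for instance, everything is real, and the sequence $\lambda_\nu$ formed by $\lambda_{\nu+1} = \varphi(\lambda_\nu)$ tends to $\sigma$, we have

$$\nu^{\frac{1}{L-1}}\, |\lambda_\nu - \sigma| \to \frac{1}{|(L-1)\,\gamma|^{\frac{1}{L-1}}} \qquad (\nu \to \infty).^{*}$$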
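
The slow convergence described above can be observed numerically. The following is a minimal sketch, not the paper's own example: the matrix `A` (one Jordan block of order 2 for the eigenvalue 1, plus the simple eigenvalue 2), the vector `eta`, and the helper names `solve` and `phi` are all illustrative assumptions. It iterates formulae (5) and (4) and records the error $|\lambda_\nu - 1|$.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(M, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def phi(A, eta, lam):
    """One Rayleigh quotient step: xi = (A - lam I)^(-1) eta, then xi*A xi / xi*xi."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    xi = solve(shifted, eta)
    a_xi = [sum(A[i][j] * xi[j] for j in range(n)) for i in range(n)]
    return sum(u * v for u, v in zip(xi, a_xi)) / sum(u * u for u in xi)


# Hypothetical test matrix: elementary divisors (lambda - 1)^2 and (lambda - 2).
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 2.0]]
eta = [-1.0, 1.0, 0.0]   # assumed starting vector, real case

lam = 0.5
errors = []
for _ in range(50):
    lam = phi(A, eta, lam)
    errors.append(abs(lam - 1.0))

print(errors[0], errors[9], errors[49])
```

With these assumed data the error decreases monotonically but only roughly like $1/\nu$, which is the behaviour the asymptotic law predicts for $L = 2$; fifty steps gain less than two decimal digits.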


This paper was sponsored by the Office of Naval Research and was prepared in part under a National Bureau of Standards contract with The American University, Washington, D.C., and in part at Stanford University, California. I am indebted to Dr. E. V. HAYNSWORTH and BETTY JANE STONE for discussions.

* OSTROWSKI, A.: Sur la convergence et l'estimation des erreurs dans quelques procédés de résolution des équations numériques. Collection of papers in memory of D. A. GRAVE, 1940.


However, the accelerating procedure of STEFFENSEN is applicable in this case if we form

$$\Phi(\lambda) = \frac{\lambda\, \varphi(\varphi(\lambda)) - \varphi(\lambda)^2}{\varphi(\varphi(\lambda)) - 2\varphi(\lambda) + \lambda}. \tag{272}$$

Then it can be proved that

$$\Phi'(\sigma) = 1 - \frac{1}{L}, \tag{273}$$

so that $\sigma$ is a point of attraction, and the convergence of the iteration

$$\lambda_{\nu+1} = \Phi(\lambda_\nu)$$

is linear. (273) corresponds to the result established in a previous paper*. However, we shall have to deduce this formula again, since the hypotheses used in the paper cited are not satisfied in this case.
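
To illustrate (272) and (273), here is a small sketch on a scalar model problem. The model map $\varphi(\lambda) = \lambda + \gamma(\lambda - \sigma)^L$ with $\sigma = 0$, $\gamma = -1$ is an assumption chosen only because it has exactly the form (271); it is not the Rayleigh quotient of any particular matrix, and the function names are hypothetical.

```python
def steffensen(phi, lam):
    """One step of formula (272) applied to the iterating function phi."""
    a = phi(lam)
    b = phi(a)
    return (lam * b - a * a) / (b - 2.0 * a + lam)


def make_model(sigma, gamma, L):
    """Hypothetical scalar model of (271): phi(lam) - lam = gamma*(lam - sigma)**L."""
    return lambda lam: lam + gamma * (lam - sigma) ** L


ratios = {}
for L in (2, 3, 4):
    phi = make_model(sigma=0.0, gamma=-1.0, L=L)   # sigma = 0 limits cancellation
    lam = 0.05
    ratios[L] = steffensen(phi, lam) / lam          # error ratio, since sigma = 0

for L, r in ratios.items():
    print(L, r, 1.0 - 1.0 / L)
```

For each tested $L$ the ratio of successive errors is close to $1 - 1/L$, so in this model the accelerated iteration is linearly convergent with exactly the factor stated in (273).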

Starting from (272), quadratically convergent iterations can be formed just as 
in Sections 49 and 50 of Part IV. Again it turns out, as has been observed in the 
case of linear elementary divisors, that the use of the classical Rayleigh iteration 
method presents a distinct advantage as compared with the generalized Rayleigh 
iteration. These results are given in detail in Sections 81 and 82. 


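74. From (270) it follows immediately that

$$\varphi(\lambda) - \lambda = \frac{\eta^{*} (A^{*} - \bar\lambda I)^{-1} \eta}{\eta^{*} (A^{*} - \bar\lambda I)^{-1} (A - \lambda I)^{-1} \eta}. \tag{274}$$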


Assume that A, reduced by a linear transformation to Jordan canonical form, 
becomes 


et k 
(275) SAIS 1 AG a A Cyl ig ea ca ee Ok 
“x=1 


Here I,,,, and U,,, denote, respectively, the unit matrix and the auxiliary 
unit matrix of order m,, and U,=0. We assume further that c=o,= ---=o7 
while all o,, «> 7, are +o. Further, we assume that 
(276) Lem = => May 2 My. 


te Oi — Wethavestors7—={1 2) 
(Ba AL GA) Don ale Cra) te a eee ge 
and, since U0 ioe AW eH sy IP with Ao, 
(277) ie ee —Sertuz, may OO, nO Gee 
»=0 
On the other hand, obviously by (275) 


ee (4 — AN = 35 (A,— Aly) 


* OSTROWSKI, A.: Über Verfahren von STEFFENSEN und HOUSEHOLDER zur Konvergenzverbesserung von Iterationen. ZAMP 7, 218-229 (1956).


and we have from (277) and (277°) for A>, since (AL =A1,,.) 7 (4e=2) remain 
bounded, 


“a ih 
(As AS DUE” + Oe 4), 


or, putting 
(278) 3: One OL NG 
where O is a zero-matrix of the order >) m,, 
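$$(\Lambda - \lambda I)^{-1} = \rho^{L} N_1 + O(\rho^{L-1}). \tag{279}$$

75. Using (275), we have from (279) that

$$(A - \lambda I)^{-1} = S^{-1} (\Lambda - \lambda I)^{-1} S = \rho^{L}\, S^{-1} N_1 S + O(\rho^{L-1}), \tag{280}$$

and therefore, putting

$$N = S^{-1} N_1 S, \tag{281}$$

we see that

$$(A - \lambda I)^{-1} = \rho^{L} N + O(\rho^{L-1}). \tag{282}$$

From (282) it follows that

$$(A^{*} - \bar\lambda I)^{-1} = \bar\rho^{L} N^{*} + O(\rho^{L-1}). \tag{283}$$

Introducing this into (274), we then obtain

$$\varphi(\lambda) - \lambda = \frac{\bar\rho^{L}\, \eta^{*} N^{*} \eta + O(\rho^{L-1})}{(\rho \bar\rho)^{L}\, \eta^{*} N^{*} N \eta + O(\rho^{2L-1})}$$

or

$$\varphi(\lambda) - \lambda = (\lambda - \sigma)^{L}\, \frac{\eta^{*} N^{*} \eta}{\eta^{*} N^{*} N \eta} + O\big([\lambda - \sigma]^{L+1}\big) = (\lambda - \sigma)^{L} \big(\gamma + O(\lambda - \sigma)\big), \tag{284}$$

assuming $\eta$ to be chosen in such a way that

$$\eta^{*} N^{*} \eta \ne 0, \qquad \eta^{*} N^{*} N \eta \ne 0. \tag{285}$$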


Since $N \ne 0$, $N^{*} N$ is a positive semi-definite matrix and the second condition of (285) can be satisfied. As to the first condition of (285), we have still to show that it can be satisfied, that is to say, that $N^{*}$ is not a skew-Hermitian matrix. But if $N^{*}$, and therefore $N$, were a skew-Hermitian matrix, then, as $D = S^{*-1} S^{-1}$ is a positive Hermitian matrix, by (281) we should have

$$S^{*} N_1^{*} S^{*-1} = N^{*} = -N = -S^{-1} N_1 S,$$

$$B = D N_1 = -N_1^{*} D = -B^{*}, \qquad B = (b_{ik}). \tag{286}$$

To prove that (286) does not hold, compute from (286) $b_{1L}$ and $b_{L1}$. If the elements of the first row of $D$ are denoted by $d_{11}, \ldots, d_{1n}$, we obtain $b_{1L}$ by multiplying $(d_{11}, \ldots, d_{1n})$ by the $L$-th column of $N_1$, which has in the first place 1 and otherwise 0. Thus we obtain $b_{1L} = d_{11}$. As to $b_{L1}$, it is obtained by multiplying the $L$-th row of $D$ by the first column of $N_1$, and this consists only of zeros. Therefore we have $b_{L1} = 0$, and since, by (286), $b_{1L} = -\bar b_{L1}$, also $d_{11} = 0$. However, this is impossible, since in a positive Hermitian matrix no diagonal elements vanish. We see that (286) is impossible, and the condition $\gamma \ne 0$ reduces to a condition $\eta^{*} K \eta \ne 0$, where $K$ is not skew-Hermitian. This proves (271).


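76. We discuss now the expression $\Phi(\lambda)$ given by (272), and for that purpose we write (271) in the form

$$\varphi(\lambda) = \lambda + (\lambda - \sigma)^{L} E(\lambda), \tag{287}$$

where $E(\lambda)$ tends to $\gamma$ with $\lambda \to \sigma$. We shall prove in Sections 76-79 that $E(\lambda)$ satisfies a suitably specialized "Lipschitz condition" with respect to $\lambda$ in the neighborhood of $\lambda = \sigma$.

We put

$$\rho_i = \frac{1}{\lambda_i - \sigma}, \qquad R_i = (A - \lambda_i I)^{-1} \qquad (i = 1, 2) \tag{288}$$

and assume that $\lambda_1$ and $\lambda_2$ tend to $\sigma$ in such a way that

$$\frac{\lambda_2 - \sigma}{\lambda_1 - \sigma} = 1 + O(\lambda_1 - \sigma). \tag{289}$$

For any matrix $T$ we shall denote by $[T]$ the expression $\eta^{*} T \eta$. We have from (287), (288) and (274)

$$E(\lambda_2) - E(\lambda_1) = \frac{\varphi(\lambda_2) - \lambda_2}{(\lambda_2 - \sigma)^{L}} - \frac{\varphi(\lambda_1) - \lambda_1}{(\lambda_1 - \sigma)^{L}} = \frac{Z}{T}, \tag{290}$$

$$Z = (\lambda_1 - \sigma)^{L}\, [R_2^{*}]\, [R_1^{*} R_1] - (\lambda_2 - \sigma)^{L}\, [R_1^{*}]\, [R_2^{*} R_2], \tag{291}$$

$$T = (\lambda_1 - \sigma)^{L} (\lambda_2 - \sigma)^{L}\, [R_1^{*} R_1]\, [R_2^{*} R_2]. \tag{292}$$

From (282) and (283) we have

$$[R_i^{*} R_i] = (\rho_i \bar\rho_i)^{L}\, [N^{*} N] + O(\rho_i^{2L-1}) \qquad (i = 1, 2),$$

and therefore by (292), (288), and (289)

$$T = (\lambda_1 - \sigma)^{L} (\lambda_2 - \sigma)^{L}\, \rho_1^{L} \bar\rho_1^{L} \rho_2^{L} \bar\rho_2^{L}\, [N^{*} N]^2\, \big(1 + O(\lambda_1 - \sigma)\big),$$

or, if we put

$$a = [N^{*} N], \qquad b = [N^{*}], \tag{293}$$

$$T = a^2\, \bar\rho_1^{\,2L}\, \big(1 + O(\lambda_1 - \sigma)\big). \tag{294}$$

77. As to $Z$, it can be written as the sum $Z = Z_1 + Z_2 + Z_3$,

$$Z_1 = \big((\lambda_1 - \sigma)^{L} - (\lambda_2 - \sigma)^{L}\big)\, [R_1^{*}]\, [R_1^{*} R_1],$$
$$Z_2 = (\lambda_1 - \sigma)^{L}\, [R_2^{*} - R_1^{*}]\, [R_1^{*} R_1],$$
$$Z_3 = (\lambda_2 - \sigma)^{L}\, [R_1^{*}]\, \big([R_1^{*} R_1] - [R_2^{*} R_2]\big).$$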


From the development (277) we have by (277°) and (280)

$$(A - \lambda I)^{-1} = \sum_{\nu=1}^{L} \rho^{\nu} K_\nu + K_0(\lambda),$$

where $K_L = N$ and the $K_\nu$ ($0 < \nu < L$) are suitably chosen constant matrices, while $K_0(\lambda)$ has rational functions of $\lambda$ as elements and remains bounded as $\lambda \to \sigma$. Then by (288) and (289)

$$R_i = \rho_i^{L} N + \sum_{\nu=1}^{L-1} \rho_i^{\nu} K_\nu + K_0(\lambda_i) \qquad (i = 1, 2), \tag{295}$$


[Ro — RY] = (er — of) [N*] + Ss (0 — oi [KF] + 0(A,— 4) 


= (22 — e) {LG [N*] (1+ O(4 — 9) +3 Og )| +0 (4, — 4), 
(296) [RE — RE] = (A, 4a) (L bar** +0 @)I. 

Further, we have by (295) and (293) 

(296°) [R¥] = ba + O(e). 


78. On the other hand, from (295) we have 


cL E a 
RFR, = (9,0) N"N + K, Kyo; + 2 KG K,o;+ 2 Ki Kyeft- Ko Ko, 
B,v=1 y=1 #=1 
where, as also in the following formulae, >’ signifies that the term with w=v=L 
is missing. If we now put 


et helt Mi (esis Diy = ES ky OL), 


[Ke KJ =01,(4)0 USeS0), [KP K)=M,.4) Usps 


Kon = Moo(Ai), [Ko Ko] = M (A) Moo( els 


where My, are rational functions of 4;, and M,,9, Moo are rational functions 


of A,, then My,, M,,9, Moo remain bounded for 2;>o. We obtain 


[Ri R;] = (0;0 0;)" wt 5} M yo; b+ Y Monld Jett SS Myol A;) OF 
(297) “,v=1 % v=1 =i | 
+ Mo(A;) Moo(A,)- 
If we now put 
¥ =a(qi — oy Jar +B! Mule eo — @5) ot 
Ly? 


Y, =a (0f — 05) ore M,,5(0f — 0%) 05, 


u,v=1 


Yio= ym, (A 1) (0; — @) + 2 Mol ) (Of — 0%) + Moo(Ar) (Mo 0 (Ar) — Moo (As), 
Yao = B (Mo, (h) — Mor(da)) 06-+ 3 (Molt) — Muo(%s)) 86 + 


Myo(A) = Mo (As)) Mo(A2); 




we have from (297) 
[Rf R,] — [Ry Ry] =¥,+ Y¥,+ Yo No, 
and by (289) 2 
Y, = (01 — @2) |L a@r or * (1 +0 (A, — 9) 4 aM oat) 
= (dy — Ay) L aor of 7? (1+ 0(4, — 0), 
Y= Uae A) ara of (1 pags o)), 


Yio 


I 


(01 — @2) s OG) Su y og) + O(A, — Aa), 
O((A Shey, 
O ((A, As) of), 


Yio 


Yoo 


(298) [RY R,] — [RF RJ =L a (0 01)" { (Ao Ay) Q1 + (Az — Ay) @1 +O (Ap 
79. From (289) we have 


£-1 


(A, — o)” = (A, — 0)” = (A, — A) 2 (a — a)" (A, — o)*-1-# 
= (A, — Ay) L(A, — o)** (1+0 (A, — 9)) 

and by definition of Z,, using (296°) and (297) 

Zy = (Ay — Ag) L (Ay — 6)*~ bor (1+ 0 (A — 2) a(01.01)", 
and using (288) 
(2981) Z,=abLi(d,— A,) 0.07” (1+ 0(A,—0)). 
Further, by definition of Z, from (288), (296), and (297) 

Zy = (Ay — 0)" (Ay — Ag) L BOE *+ a (Q,.0,)" (1+ 0 (A, — 0), 
(298?) Z, =abLo, 07" (A, — A.) (1+ 0(A, —0)). 


And from the definition of Z, and (288), (296°), and (298) we have 


(298°) Z,=abL Qi" ((p— Ay) r+ (Aa — Aa) Gi +O (Ae — A). 

From (298?), (298), and (298%) we obtain 

(299) Z=0 ((A,— A) 037), 

since the leading terms in $Z_1$, $Z_2$ and $Z_3$ cancel out.

By (290), (294) and (299) we have finally, since $a \ne 0$ by assumption,

$$E(\lambda_2) - E(\lambda_1) = O(\lambda_2 - \lambda_1) \tag{300}$$

under the condition (289).


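80. For $\lambda_1 = \lambda$, $\lambda_2 = \varphi(\lambda) = \lambda + (\lambda - \sigma)^{L} E(\lambda)$ with $\lambda \to \sigma$, the condition (289) is obviously satisfied, and we have from (300)

$$E(\varphi(\lambda)) - E(\lambda) = O\big((\lambda - \sigma)^{L}\big) \qquad (\lambda \to \sigma).$$

But now it follows from (287) that

$$\varphi(\varphi(\lambda)) = \varphi(\lambda) + (\varphi(\lambda) - \sigma)^{L}\, E(\varphi(\lambda))$$
$$= \varphi(\lambda) + (\lambda - \sigma)^{L}\, \big[1 + (\lambda - \sigma)^{L-1} E(\lambda)\big]^{L}\, \big(E(\lambda) + O((\lambda - \sigma)^{L})\big)$$
$$= \varphi(\lambda) + (\lambda - \sigma)^{L} E(\lambda) + L\, (\lambda - \sigma)^{2L-1} E(\lambda)^2 + O\big((\lambda - \sigma)^{2L}\big),$$

$$\varphi(\varphi(\lambda)) - 2\varphi(\lambda) + \lambda = \lambda - \varphi(\lambda) + (\lambda - \sigma)^{L} E(\lambda) + L\, (\lambda - \sigma)^{2L-1} E(\lambda)^2 + O\big((\lambda - \sigma)^{2L}\big),$$

$$\varphi(\varphi(\lambda)) - 2\varphi(\lambda) + \lambda = L\, (\lambda - \sigma)^{2L-1} E(\lambda)^2 + O\big((\lambda - \sigma)^{2L}\big),$$

and, since from (272) it follows immediately that

$$\Phi(\lambda) - \lambda = -\frac{(\varphi(\lambda) - \lambda)^2}{\varphi(\varphi(\lambda)) - 2\varphi(\lambda) + \lambda}, \tag{301}$$

we have, since $E(\lambda) \to \gamma \ne 0$,

$$\Phi(\lambda) - \lambda = -\frac{(\lambda - \sigma)^{2L} E(\lambda)^2}{L\, (\lambda - \sigma)^{2L-1} E(\lambda)^2\, \big(1 + O(\lambda - \sigma)\big)} = -\frac{1}{L}\, (\lambda - \sigma)\, \big(1 + O(\lambda - \sigma)\big),$$

$$\Phi(\lambda) - \sigma = \Big(1 - \frac{1}{L}\Big)(\lambda - \sigma) + O\big((\lambda - \sigma)^2\big) \qquad (\lambda \to \sigma). \tag{302}$$

We see that the iteration by $\lambda' = \Phi(\lambda)$ converges linearly.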


81. If $L$ is known, at once from (302) we obtain an iteration with quadratic convergence defining $\Phi_1(\lambda)$ by (191) as

$$\Phi_1(\lambda) = L\, \Phi(\lambda) - (L - 1)\, \lambda, \qquad \Phi_1(\lambda) - \lambda = L\, \big(\Phi(\lambda) - \lambda\big),$$

and all observations made in Section 49 of Part IV remain valid. To obtain $\Phi(\lambda)$ in our case implies 2 Horners, just as many as were necessary in Part IV in order to obtain $\varphi(\lambda)$, though even under these circumstances the computation by (5) and (4) appears to be easier to organize than the computation in Section 48 of Part IV using two sequences of vectors.

If $L$ is unknown and we wish to use a further iteration, the method used in this part presents definite advantages, as will be shown now.
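
The effect of knowing $L$ can be sketched on the same kind of scalar model as before. Everything here is an illustrative assumption: the model map $\varphi(\lambda) = \lambda - \lambda^2$ (that is, $\sigma = 0$, $L = 2$, $\gamma = -1$ in (271)) and the function names. For this particular model one finds in closed form $\Phi_1(\lambda) = -\lambda^2/(2 - \lambda)$, so the error is squared at each step.

```python
def phi(lam):
    """Hypothetical model map with sigma = 0, L = 2, gamma = -1."""
    return lam - lam * lam


def steffensen(lam):
    """Formula (272) applied to phi."""
    a = phi(lam)
    b = phi(a)
    return (lam * b - a * a) / (b - 2.0 * a + lam)


def phi_1(lam, L=2):
    """Phi_1(lam) = L*Phi(lam) - (L - 1)*lam, using the known multiplicity L."""
    return L * steffensen(lam) - (L - 1) * lam


lam0 = 0.1
lam1 = phi_1(lam0)
lam2 = phi_1(lam1)
print(lam0, lam1, lam2)   # errors, since the fixed point of the model is 0
```

Two steps take the error from $10^{-1}$ to roughly $10^{-5}$ in this model, compared with a mere halving per step for $\Phi$ alone.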


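82. In order to obtain an iteration with quadratic convergence, we use now a modification of STEFFENSEN'S accelerating procedure, proposed by HOUSEHOLDER, and form*

$$\psi(\lambda) = \frac{\lambda\, \varphi(\Phi(\lambda)) - \varphi(\lambda)\, \Phi(\lambda)}{\varphi(\Phi(\lambda)) + \lambda - \varphi(\lambda) - \Phi(\lambda)}. \tag{303}$$

From (303) it follows immediately that

$$\psi(\lambda) - \lambda = -\frac{(\varphi(\lambda) - \lambda)\, (\Phi(\lambda) - \lambda)}{\varphi(\Phi(\lambda)) + \lambda - \varphi(\lambda) - \Phi(\lambda)}. \tag{304}$$

If we put $\lambda_2 = \Phi(\lambda)$ and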
(305) 4, =(1-](4—a) +9, 


in (300), we have by (302) 
fy 4140(4,—0), e—h=O((4 —9)"), 


4,— 6 


and (289) is satisfied as A->a. It follows from (300) and (305) that 
(306) E(®(A)) — E(A,) =O((A —o)?). 
Hence by (287) and (302) 
g(P(a)) = D(A) + (®(a) — 0)" E(@(2)) 
= Pi) + (4a)! (1— - + OA — 0) E() + O((A — o)*), 


ve 1 
and, since by (300), E(A,) = E(c) +0 (A—o), this is 
=O (1 


p) A — 0)" Ela) +0 ((A— 9); 
therefore in virtue of (287) and (271) 


p(P(A)) + 2 — H(A) — 9 (2) =(1— L) A 0) E@) — (A — 0 BUA) 
+0((A —o)**?), 


(307) p(®(A)) +A— BA) — pA) = — (A — 9) (y+ O(4— 9). 


From (304), (271), and (302) we have 


(Ao y (—F =a) (1400-—a) 
pi) A= (Aa) +0((A—0)2), 
— 5 (=o)! (y +020) 
(308) p(4) —6 =O((A —o)?), 


and we see that the iteration by the iterating function y(A) gives a quadratic 
convergence. 


On the other hand, the computation of w(A) requires 3 Horners as defined 
in Section 58, while the procedure described in Section 50 would require 4 Horners. 


* For the formulae (303) and (304), see the formulae (4), (6) in my second paper cited in Sect. 74. From the discussion in that paper it appears to follow that only in exceptional cases does HOUSEHOLDER'S modification present any advantage over STEFFENSEN'S procedure. However, this discussion was based on the assumption that both iterations which are to be combined require the same amount of computational work. But in the case considered here the two iterating functions under discussion imply essentially different computational difficulties, and in this case, indeed, HOUSEHOLDER'S procedure is distinctly superior to the original one of STEFFENSEN.


83. We consider as an example the same matrix as in Section 57, 7.¢., 


Zr 0.23 
A=|(1 1 3 
te 


with the elementary divisors (A—1)* and A—2. We choose (1,0, 0) as the 
vector 7 in (274) and obtain 


07 yh (2— A”) (4—24A- A?) (A—1)? 
(309) p (A) A (2A) (4274 PALA A)2 7 


Starting with $\lambda_0 = .5$ we obtain by (309)

$$\lambda_1 = .563934426, \qquad \lambda_2 = .610239496.$$

The "convergence" to 1 is obviously very slow, as was to be expected. If we use the STEFFENSEN accelerating formula (272), we obtain

$$\lambda' = .731863877,$$

while, if we assume that the value of $L = 2$ is known, we obtain from (191)

$$\lambda^{*} = .96372775.$$

Starting with this value as $\lambda_0^{*}$, using (309) twice, we obtain

$$\lambda_1^{*} = .96394465404, \qquad \lambda_2^{*} = .964158917305.$$

The convergence to 1 again slows down quite considerably, while, if we use (272), from the last three values we obtain

$$\lambda^{*\prime} = .98165475.$$

We have for the errors of $\lambda_0$, $\lambda'$, $\lambda^{*}$, and $\lambda^{*\prime}$

$$1 - \lambda_0 = .5, \qquad 1 - \lambda' = .268136123, \qquad \frac{1 - \lambda'}{1 - \lambda_0} = .536\ldots,$$
$$1 - \lambda^{*} = .03627225, \qquad 1 - \lambda^{*\prime} = .01834525, \qquad \frac{1 - \lambda^{*\prime}}{1 - \lambda^{*}} = .505\ldots,$$

in agreement with (273).
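
The first accelerated values can be rechecked from the printed iterates alone, without knowing the matrix. The sketch below assumes only the three values $\lambda_0 = .5$, $\lambda_1 = .563934426$, $\lambda_2 = .610239496$ quoted in the text, inserts them into (272) with $\varphi(\lambda_0) = \lambda_1$, $\varphi(\varphi(\lambda_0)) = \lambda_2$, and then applies $\Phi_1 = L\Phi - (L-1)\lambda$ with $L = 2$; the variable names are illustrative.

```python
lam0, lam1, lam2 = 0.5, 0.563934426, 0.610239496   # iterates quoted in the text

# Formula (272), written in the equivalent difference form
# Phi = lam0 - (lam1 - lam0)^2 / (lam2 - 2*lam1 + lam0).
d1 = lam1 - lam0
d2 = lam2 - lam1
steff = lam0 - d1 * d1 / (d2 - d1)

# Known multiplicity L = 2: Phi_1 = L*Phi - (L - 1)*lam0.
L = 2
known_L = L * steff - (L - 1) * lam0

error_ratio = (1.0 - steff) / (1.0 - lam0)
print(steff, known_L, error_ratio)
```

The result agrees with the printed error $1 - \lambda' = .268136123$ and with $\lambda^{*} = .96372775$ to the accuracy that nine-digit input allows, and the error ratio is about $.536$, reasonably close to the limiting value $1 - 1/L = .5$.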
On the other hand, if we use the HOUSEHOLDER acceleration formula (303), starting from $\lambda' = \lambda_0'$, by (309) we obtain the next value

$$\lambda_1' = .74699699292;$$

and then by (303) we get

$$\lambda_2' = .898787684,$$

which is probably the best approximation to be obtained in using three Horners, if the value of $L = 2$ is not to be used.
Arch, Rational Mech. Anal., Vol. 4 14 




84. We return now to the results proved in the Parts III and V in order to draw some conclusions from them. These conclusions will be based on two lemmas.

Lemma 1. Assume $p \ge 1$ and put, for an integer $n \ge 2$,

$$f_n(x) = x^n - p\,(x^{n-1} + \cdots + x + 1). \tag{310}$$

Then $f_n(x)$ has a positive simple root $\mu_n$ satisfying

$$p + 1 - \frac{1}{n} < \mu_n < p + 1, \tag{311}$$

and we have

$$\mu_n < \mu_{n+1} \quad (n \ge 2), \qquad \mu_n \to p + 1 \quad (n \to \infty). \tag{312}$$

Further, if the maximal modulus of all roots of $f_n(x)$ which are $\ne \mu_n$ is denoted by $q_n$, we have

$$0 < q_n \le \mu_n - p < 1. \tag{313}$$

85. Proof. By DESCARTES' Rule, $f_n(x)$ has exactly one positive root $\mu_n$. Putting

$$F_n(x) = (x - 1)\, f_n(x) = x^{n+1} - (p + 1)\, x^n + p,$$

we verify immediately that $F_n(p + 1) = p > 0$, and therefore $\mu_n < p + 1$. On the other hand, we have

$$F_n\Big(p + 1 - \frac{1}{n}\Big) = p - \frac{1}{n}\Big(p + 1 - \frac{1}{n}\Big)^{n},$$

and if we prove that

$$n\, p < \Big(p + 1 - \frac{1}{n}\Big)^{n}, \tag{314}$$

(311) will follow immediately.

To prove (314), observe that when $n = 2$ it becomes the inequality $2p < (p + \tfrac{1}{2})^2$, and this is indeed true since $(p - \tfrac{1}{2})^2 > 0$. If we now go from $n$ to $n + 1$ in (314), the expression on the left is multiplied by $1 + \frac{1}{n}$, while the expression to the right is multiplied by

$$\frac{\big(p + 1 - \frac{1}{n+1}\big)^{n+1}}{\big(p + 1 - \frac{1}{n}\big)^{n}} > p + 1 - \frac{1}{n+1} \ge 2 - \frac{1}{n+1} > 1 + \frac{1}{n}.$$

From (310) we have

$$f_{n+1}(x) = x\, f_n(x) - p, \qquad f_{n+1}(\mu_n) = \mu_n\, f_n(\mu_n) - p = -p,$$

and (312) follows.




To prove (313), put €=1/w,, and 


G15) g(a) = BE) Sgt tgp tena, Gat, 
Se ey ae he i 2”) 
ee Pees eae) 

(316) yee pit bey ay 


But here we have 


Lnapa- 1; Pig pp i Pie P=, 2p Bp 


Chu Mi ln PP Cy 
~ —— << n —— 
Cy Ln—P Bs p Co 


and for y> 4 
Coty Zi 1 — Er v2 2 4 — EY oa, G 
Cy 4—§"—” Nae 6 tee tal Cy—y ; 


since 
(1 wea in )2 | (1 non) on) (4 ae Seat) aoe Ue ng ue PSA $35 De 
Seems cere. 
Therefore 


(317) Max te aie, alder Pi 


OSvSn-2 Cy 


By a theorem of ENESTRÖM-KAKEYA the maximum modulus $q_n$ of a root of $g(x)$ does not exceed the expression (317), and we have $q_n \le \mu_n - p$. Lemma 1 is proved.
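
The bounds (311) and the monotonicity (312) are easy to check numerically. The following sketch uses plain bisection on $[1, p+1]$, where $f_n(1) = 1 - np < 0$ and $f_n(p+1) > 0$; the function names and the tested ranges of $p$ and $n$ are illustrative choices, not part of the lemma.

```python
def f(n, p, x):
    """f_n(x) = x**n - p*(x**(n-1) + ... + x + 1), formula (310)."""
    return x ** n - p * sum(x ** k for k in range(n))


def positive_root(n, p, tol=1e-13):
    """Bisection for the unique positive root mu_n of f_n."""
    lo, hi = 1.0, p + 1.0          # f_n(1) < 0 < f_n(p + 1) for p >= 1, n >= 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(n, p, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


roots = {(n, p): positive_root(n, p) for p in (1, 2, 3) for n in range(2, 10)}

for p in (1, 2, 3):
    print(p, [round(roots[(n, p)], 6) for n in range(2, 10)])
```

For $p = 1$, $n = 2$ the polynomial is $x^2 - x - 1$ and the root is the golden ratio; as $n$ grows, the roots increase towards $p + 1$, as stated in (312).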


86. Lemma 2. Consider a sequence $y_\nu$ ($\nu = 0, 1, \ldots$) such that $y_\nu \to \infty$ and

$$y_\nu \ge p\,(y_{\nu-1} + y_{\nu-2} + \cdots + y_{\nu-n}) \qquad (\nu = n, n + 1, \ldots) \tag{318}$$

for some $p \ge 1$. Then we have

$$\varliminf_{\nu \to \infty} \frac{y_\nu}{\mu^{\nu}} > 0, \tag{319}$$

where $\mu$ is the positive root of the polynomial (310).

Proof. Without loss of generality we can assume that all $y_\nu$ ($\nu = 0, 1, \ldots$) are positive, since otherwise we can replace the sequence $y_\nu$ by $y_{\nu+m}$ for a suitable $m$. Put

$$\operatorname{Min}(y_0, y_1, \ldots, y_{n-1}) = d > 0, \tag{320}$$

and consider the difference equation

$$z_\nu = p\,(z_{\nu-1} + z_{\nu-2} + \cdots + z_{\nu-n}) \qquad (\nu = n, n + 1, \ldots) \tag{321}$$

with

$$z_0 = z_1 = \cdots = z_{n-1} = d. \tag{322}$$

From (320) and (322) it follows easily that $y_\nu \ge z_\nu$ ($\nu \ge 0$), and therefore (319) will be proved if we prove that

$$\lim_{\nu \to \infty} \frac{z_\nu}{\mu^{\nu}} > 0. \tag{323}$$

Denote the $n$ roots of the polynomial (310) by $\mu_1 = \mu, \mu_2, \ldots, \mu_n$, where, as was proved in Lemma 1,

$$|\mu_\kappa| \le q_n < 1 < \mu \qquad (\kappa = 2, \ldots, n). \tag{324}$$

The general solution of the difference equation (321) can be written in the form

$$z_\nu = c_1\, \mu^{\nu} + \sum_{\kappa=2}^{n} c_\kappa\, \mu_\kappa^{\nu},$$

where if some of the $\mu_\kappa$ become multiple roots (though it can be easily proved that this is impossible) the corresponding constant $c_\kappa$ must be replaced by a polynomial in $\nu$ of degree $\le n - 1$. We have therefore, by virtue of (324),

$$z_\nu = c_1\, \mu^{\nu} + O\big(\nu^{n-1} q_n^{\nu}\big).$$

Since the $O$-term tends to zero, we should have $z_\nu \to -\infty$ or $z_\nu \to 0$ if $c_1$ were $\le 0$. But from (321) and (322) it follows that $z_\nu \ge d$ for all $\nu$. Therefore we have $c_1 > 0$, $z_\nu\, \mu^{-\nu} \to c_1 > 0$. Lemma 2 is proved.
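
The comparison sequence used in the proof can be illustrated in the simplest case. The sketch below assumes $p = 1$, $n = 2$, $d = 1$, so that (321) is the Fibonacci recurrence and $\mu = (1 + \sqrt 5)/2$ is the positive root of $x^2 - x - 1$; the function name is hypothetical.

```python
def comparison_sequence(p, n, d, last_index):
    """z_0 = ... = z_{n-1} = d and z_v = p*(z_{v-1} + ... + z_{v-n}), as in (321), (322)."""
    z = [d] * n
    while len(z) <= last_index:
        z.append(p * sum(z[-n:]))
    return z


p, n, d = 1, 2, 1                      # assumed illustrative parameters
mu = (1.0 + 5.0 ** 0.5) / 2.0          # positive root of x^2 - x - 1, i.e. of (310)
z = comparison_sequence(p, n, d, 40)

scaled = [z[v] / mu ** v for v in range(len(z))]
print(scaled[5], scaled[20], scaled[40])
```

The scaled values $z_\nu\, \mu^{-\nu}$ settle quickly at the positive constant $c_1 = \mu/\sqrt{5} \approx 0.7236$, which is the content of (323) in this special case.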


87. In the proof of the Theorems 2 and 3 in Part III, our results followed 
from the following relations. Under the conditions of Theorem 3, we have (see 
(97) and (162)) 

k 
= ‘hy —o \2 
(325) Ayia — oy RE US| (k + oo), 


pail Wenn 


where y is a certain constant +0, 0, is one of the eigenvalues of. A, different 
from o, and L is the integer defined in the formulation of condition A of Theorem 3 
and =1 under condition B of Theorem 3. Further, we have under the conditions 
of Theorem 2 (see (131)) 


k 
(326) Anyi — 9 =0[(N)** TT (A, — 0)", 
x=0 
where JN, is a certain constant. 


We shall now prove that under the conditions of Theorem 2 of Part III, 
for any «>0 and for a certain positive ‘=¢(e), we have 


1 
|4,—o| 


(327) lim Ig lg 21g3, 4,—0=O(e-t@-"), 
Indeed putting 

wee 1 
(328) 2, = Ig ee 


from (82) we have for an arbitrary integer n 


y— 2(6, Ss ey aco (vy — 00). 


The Rayleigh Quotient. VI 105 
Therefore, if we put z,-C=y, for a suitable C, we have 
Wa et Sen) 0 = HA, >..); 
but then (319) holds with  =2, and therefore by (311) 


lim 4 ge: lim = \y 
tae Gay 


If we take n> = we have, for a certain positive t=¢(¢), from a certain y=v(e) on, 


Ly . = 

cme ee (v= r(e)), 
and therefore 

(329) Apo xe OY (v= v(e)); 
but from this (327) and (328) follow immediately. 


88. In the case discussed in Part V, Theorem 2 was obtained from the follow- 
ing relation (see (256)) 


k ‘ 
(330) Ags —0 =0(c JT (2, —0)] (rstesy 
where c is a suitable positive constant. But hence it follows, putting lg al ae 
that ; 

k 
(331) (igs BEE: 


v=0 


for a suitable C. From here on we proceed as in § 87, applying Lemma 2 with 
f=1, and obtain 


(332) |A,—a|<e 72-4” (v= r(e)), 


for arbitrarily small positive ¢ and a suitable positive t=¢(e). But from this, 
under the conditions of Theorem 2 of Part V, it follows obviously that 


(333) Mia — Olen) ad 09) 


(334) lim — Iglg 5, 2 182. 


Certenago-Montagnola 
Ticino, Switzerland 


(Received June 4, 1959) 


Iterative Treatment of Linear Functional Equations


HORST BIALY


Communicated by L. COLLATZ


1. Statement of the Problem
In what follows let $\mathfrak{H}$ be a complete, not necessarily separable Hilbert space with zero element 0, $A$ a bounded linear mapping of $\mathfrak{H}$ into itself, $E$ the identity mapping and $\Theta$ the zero mapping of $\mathfrak{H}$; that is, for every $x \in \mathfrak{H}$ we have $Ex = x$ and $\Theta x = 0$. $R^{+}$ is the set of positive real numbers; $\lambda$ and $\tau$ are real parameters. $A^{*}$ denotes the mapping adjoint to $A$. $A$ is called positive ($A \ge 0$) if $A$ is Hermitian ($A \ne \Theta$, $A = A^{*}$) and $(Ax, x) \ge 0$ holds for all $x \in \mathfrak{H}$¹. If in addition 0 is not an eigenvalue of $A$, i.e. if $Ax = 0$ always implies $x = 0$, we call $A$ strictly positive ($A > 0$). $\overline{\mathfrak{L}}$ is the closure of a set $\mathfrak{L} \subset \mathfrak{H}$, and $A(\mathfrak{L})$ is the image of $\mathfrak{L}$ under $A$.
The aim of this paper is the iterative representation of a solution $x \in \mathfrak{H}$ of the functional equation

$$A x = y \tag{1.1}$$

for given $A$ and $y \in \mathfrak{H}$. For special $A$, necessary and sufficient conditions for convergence are obtained. Even when (1.1) is not solvable, i.e. when $y \notin A(\mathfrak{H})$, we obtain clear information about the behaviour of the iteration methods given. The solvable and the unsolvable case can be combined in the following minimum problem:

"Find an $x^{*} \in \mathfrak{H}$ such that $\|A x^{*} - y\|$ approximates the number $J(A, y) := \inf_{x \in \mathfrak{H}} \|A x - y\|$ as well as possible."

The starting point of our investigations is an iteration method given by FRIDMAN [1] for linear integral equations of the first kind. If we denote the reciprocal eigenvalues of the kernel $L$ by $\lambda_i$, $i = 1, 2, \ldots$, and abbreviate the integral transformation $\int_a^b L(s, t)\, x(t)\, dt$ by $Lx$, then Fridman's theorem reads:

Let a) $L$ be square integrable over the interval $[a, b]$, symmetric and positive definite²,
b) $y \in L_2[a, b]$ and $Lx = y$ solvable,
c) $0 < \tau < 2\lambda$ with $\lambda = \min |\lambda_i|$;

then the sequence $(x_n)$ formed according to the formula

$$x_n = x_{n-1} + \tau\,(y - L x_{n-1}) \tag{1.2}$$

converges in the mean to a solution of the equation $Lx = y$.


1 In a complex Hilbert space $A = A^{*}$ already follows from $(Ax, x) \ge 0$.
2 This corresponds to the strict positivity of a mapping.
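
A finite-dimensional sketch may make the role of the parameter $\tau$ in (1.2) concrete. The $2 \times 2$ matrix, the right-hand side and the function names below are illustrative assumptions, not taken from the paper. The matrix has eigenvalues 1 and 3, so its reciprocal eigenvalues are 1 and 1/3 and condition c) reads $0 < \tau < 2/3$.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def iterate_12(A, y, tau, x0, steps):
    """Iteration (1.2): x_n = x_{n-1} + tau*(y - A x_{n-1})."""
    x = list(x0)
    for _ in range(steps):
        residual = [yi - axi for yi, axi in zip(y, matvec(A, x))]
        x = [xi + tau * ri for xi, ri in zip(x, residual)]
    return x


def distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


A = [[2.0, 1.0], [1.0, 2.0]]       # symmetric, eigenvalues 1 and 3 (assumed example)
x_true = [1.0, -2.0]
y = matvec(A, x_true)

good = iterate_12(A, y, tau=0.5, x0=[0.0, 0.0], steps=80)    # 0.5 < 2/3
bad = iterate_12(A, y, tau=0.7, x0=[0.0, 0.0], steps=200)    # 0.7 > 2/3

print(distance(good, x_true), distance(bad, x_true))
```

With $\tau = 0.5$ the error is halved at every step; with $\tau = 0.7$ the component along the eigenvector of the eigenvalue 3 is multiplied by $|1 - 2.1| = 1.1$ per step and the iteration diverges.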
The method (1.2) is not new. For integral equations of the second kind

$$\kappa K x + y = x$$

with square-integrable kernel $K$, WIARDA [2] already treated in 1930 the parameter-dependent method

$$x_n = (1 - \tau)\, x_{n-1} + \tau\,(\kappa K x_{n-1} + y), \qquad x_0 \in L_2[a, b], \tag{1.3}$$

which for $\tau = 1$ becomes the classical iteration. Further investigations of it can be found in BÜCKNER [3], who treats a generalization of (1.3) in [4]. Formally, (1.3) goes over into (1.2) under the substitution $\kappa K = E - L$. The convergence behaviour, however, must be investigated anew for (1.2). We shall show that the sufficient convergence conditions given by FRIDMAN can be relaxed, so that (1.2) becomes applicable to a larger class of problems.

For finite systems of equations, v. MISES [5] already used the iteration method (1.2) in 1929. There, however, the choice of the parameter $\tau$ is tied to a knowledge of the eigenvalues.


2. Convergence Analysis for the Method (1.2)


The null space $\mathfrak{N}(A)$ of a mapping $A$ and its orthogonal complement $\mathfrak{M}(A)$ are defined by

$$\mathfrak{N}(A) := \{x \in \mathfrak{H} : A x = 0\},$$
$$\mathfrak{M}(A) := \mathfrak{H} \ominus \mathfrak{N}(A).$$

$P(A)\,x$ and $\Pi(A)\,x$ are the projections of $x \in \mathfrak{H}$ into $\mathfrak{N}(A)$ and $\mathfrak{M}(A)$, respectively. With the topology induced by $\mathfrak{H}$, $\mathfrak{N}(A)$ and $\mathfrak{M}(A)$ likewise become Hilbert spaces, and $x = P(A)\,x + \Pi(A)\,x$ holds for every $x \in \mathfrak{H}$.
$x_n \to x$ denotes the convergence of the sequence $(x_n) \subset \mathfrak{H}$ to $x \in \mathfrak{H}$, that is, $\|x_n - x\| \to 0$ in the sense of real analysis. The iteration method generated by a (not necessarily linear) mapping $Q$ of $\mathfrak{H}$ into itself,

$$x_n = Q\, x_{n-1}, \qquad n = 1, 2, \ldots,$$

with the initial element $x_0 \in \mathfrak{H}$, briefly

$$x_n = Q\, x_{n-1}; \qquad x_0 \in \mathfrak{H}, \tag{2.1}$$

is called convergent if the sequence $(Q^n x_0) \subset \mathfrak{H}$ converges. The following properties can easily be read off from the literature cited under [6], [7] and [8].
E 1) The equation (1.1) with $y \in A(\mathfrak{H})$ has exactly one solution in $\mathfrak{M}(A)$, which is at the same time the solution of least norm, briefly the "minimal solution", of the problem.


168 Horst BIALy: 


E 2) For every $x \in \mathfrak{H}$ the following relations hold:

$$\overline{A(\mathfrak{H})} = \mathfrak{M}(A^*), \qquad \overline{A^*(\mathfrak{H})} = \mathfrak{M}(A),$$
$$P(A)A^*x = o = A P(A)x,$$
$$\Pi(A)A^*x = A^*x, \qquad A\,\Pi(A)x = Ax.$$

E 3) Every $A \ge O$ can be regarded as a strictly positive mapping of $\mathfrak{M}(A)$ into itself.

E 2) can be found in [8], pp. 254, 255, while E 3) is an immediate consequence of E 2).
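Lemma 2.1. For every $y \in \mathfrak{H}$,

$$J(A, y) = \|P(A^*)y\|. \tag{2.2}$$

Every solution $x' \in \mathfrak{H}$ of the equation $Ax = \Pi(A^*)y$ satisfies

$$J(A, y) = \|Ax' - y\|. \tag{2.3}$$

Proof. By E 2), $Ax - \Pi(A^*)y \in \mathfrak{M}(A^*)$, and since $y - \Pi(A^*)y = P(A^*)y$ is orthogonal to $\mathfrak{M}(A^*)$,

$$\|Ax - y\|^2 = \|Ax - \Pi(A^*)y\|^2 + \|P(A^*)y\|^2.$$

Because $\overline{A(\mathfrak{H})} = \mathfrak{M}(A^*)$, we have $\inf_{x \in \mathfrak{H}} \|Ax - \Pi(A^*)y\| = 0$, which proves (2.2) and (2.3).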


Lemma 2.2. Let $A > O$ and

$$0 < \tau < 2\|A\|^{-1}; \tag{2.4}$$

then the iteration method

$$u_n = u_{n-1} - \tau A u_{n-1}; \quad u_0 \in \mathfrak{H} \tag{2.5}$$

converges to $o$.


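Proof. Let the family of projections $(E_\lambda)$ be the resolution of the identity generated by $A$. Then the function

$$g(\lambda) = \sqrt{(E_\lambda u_0, u_0)} = \|E_\lambda u_0\|$$

is continuous at the point $\lambda = 0$. We have $g(0) = 0$ and $g(\|A\| + \alpha) = \|u_0\|$ for all $\alpha \in \mathbb{R}^+$ (cf. [7], p. 189, Theorem 3).

The function $f(\lambda) = 1 - \tau\lambda$ is monotonically decreasing, and by (2.4) there is an $\alpha \in \mathbb{R}^+$ such that for all $\delta \in \mathbb{R}^+$ with $\delta < \|A\| + \alpha$ the inequality

$$q(\delta) := \max_{\delta \le \lambda \le \|A\| + \alpha} |f(\lambda)| < 1 \tag{2.6}$$

holds.

For every $\varepsilon \in \mathbb{R}^+$ there is now a $\delta_0 \in \mathbb{R}^+$ with $\delta_0 < \|A\| + \alpha$ such that

$$g(\delta_0) = \|E_{\delta_0} u_0\| < \tfrac{1}{2}\varepsilon.$$

A natural number $n_0$ can then be determined so that

$$q^n(\delta_0)\,\|u_0 - E_{\delta_0} u_0\| < \tfrac{1}{2}\varepsilon$$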


Lineare Funktionalgleichungen 169 


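is satisfied for all $n > n_0$. For $n > n_0$ the following estimate therefore holds:

$$\|u_n\| = \|(E - \tau A)^n u_0\| = \Big\| \int_0^{\|A\|+\alpha} (1 - \tau\lambda)^n \, dE_\lambda u_0 \Big\|$$
$$\le \Big\| \int_0^{\delta_0} (1 - \tau\lambda)^n \, dE_\lambda u_0 \Big\| + \Big\| \int_{\delta_0}^{\|A\|+\alpha} (1 - \tau\lambda)^n \, dE_\lambda u_0 \Big\|$$
$$\le \Big\| \int_0^{\delta_0} dE_\lambda u_0 \Big\| + q^n(\delta_0) \Big\| \int_{\delta_0}^{\|A\|+\alpha} dE_\lambda u_0 \Big\|$$
$$= \|E_{\delta_0} u_0\| + q^n(\delta_0)\,\|u_0 - E_{\delta_0} u_0\| < \varepsilon, \quad \text{q.e.d.}$$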


In the case $\|E - \tau A\| < 1$ the statement of Lemma 2.2 is trivial. Its significance lies in the fact that for certain $A > O$ no parameter $\tau$ can be given such that $\|E - \tau A\| < 1$. For example, the infinite matrix

$$A = \begin{pmatrix} a_1 & 0 & 0 & \cdots \\ 0 & a_2 & 0 & \cdots \\ 0 & 0 & a_3 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

with the null sequence $(a_n) \subset \mathbb{R}^+$ represents a strictly positive mapping⁸ on the Hilbert sequence space $\mathfrak{H} = l_2$ with $\|A\| = \max_n (a_n)$, and hence satisfies the assumptions of Lemma 2.2. As is easily verified, however,

$$\|E - \tau A\| = \sup_n |1 - \tau a_n| \ge 1$$

for all (even complex) values of the parameter $\tau$. If inequality (2.4) is satisfied, the equality sign holds.
Theorem 1. Let $A \ge O$, $y \in \mathfrak{H}$, $0 < \tau < 2\|A\|^{-1}$. Then the following statements hold for the iteration method

$$x_n = x_{n-1} + \tau\,(y - A x_{n-1}); \quad x_0 \in \mathfrak{H} \tag{2.7}$$

a) $A x_n \to \Pi(A)y$ and $\|A x_n - y\| \to J(A, y)$;

b) (2.7) converges if and only if equation (1.1) is solvable. In that case $x_n \to P(A)x_0 + \hat{x}$, where $\hat{x}$ is the minimal solution of (1.1).
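The statements of Theorem 1 can be tried out on a small matrix. The following is only a sketch under assumed data: the singular matrix `A`, the right-hand sides and the value `tau = 0.4` are hypothetical illustrations and do not come from the paper. Here $\|A\| = 2$, the null space is spanned by $(1, -1)$, and $P(A)x_0 = (1, -1)$.

```python
def matvec(A, x):
    """Matrix-vector product for lists of lists."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def iterate_2_7(A, y, x0, tau, steps):
    """Run x_n = x_{n-1} + tau * (y - A x_{n-1}), cf. (2.7)."""
    x = list(x0)
    for _ in range(steps):
        Ax = matvec(A, x)
        x = [xi + tau * (yi - ai) for xi, yi, ai in zip(x, y, Ax)]
    return x

# Hypothetical data: A >= O, singular, eigenvalues 0 and 2, so 0 < tau < 1.
A = [[1.0, 1.0], [1.0, 1.0]]
tau = 0.4
x0 = [3.0, 1.0]

# Solvable case: y = (2, 2) lies in the range of A; the minimal solution is (1, 1).
x_solvable = iterate_2_7(A, [2.0, 2.0], x0, tau, 60)

# Unsolvable case: y = (2, 0) has a component in the null space of A.
x_unsolvable = iterate_2_7(A, [2.0, 0.0], x0, tau, 60)
Ax_unsolvable = matvec(A, x_unsolvable)

print(x_solvable)      # close to P(A) x0 + minimal solution = (2, 0)
print(Ax_unsolvable)   # close to Pi(A) y = (1, 1), although x_n itself drifts
```

In the unsolvable case the iterates grow along the null space while $A x_n$ still settles down, which is the behaviour statement a) describes.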

Proof. a) From (2.7) it follows that

$$A x_n = A x_{n-1} + \tau A\,[P(A)y + \Pi(A)y - A x_{n-1}]$$

or, because $A P(A)y = o$,

$$v_n = v_{n-1} - \tau A v_{n-1} \quad \text{with} \quad v_n = A x_n - \Pi(A)y \in \mathfrak{M}(A).$$

By E 3), $A$ is strictly positive on the Hilbert space $\mathfrak{M}(A)$. Applying Lemma 2.2 in $\mathfrak{M}(A)$ gives $v_n \to o$; hence $A x_n \to \Pi(A)y$ and, by (2.2), $\|A x_n - y\| \to \|P(A)y\| = J(A, y)$.


⁸ We shall not distinguish explicitly between the mapping $A$ and the matrix that generates it.




b) Using E 2), (2.7) splits into the two equations

$$P(A)x_n = P(A)x_{n-1} + \tau P(A)y = P(A)x_0 + n\tau P(A)y, \tag{2.8}$$
$$\Pi(A)x_n = \Pi(A)x_{n-1} - \tau\,[A\,\Pi(A)x_{n-1} - \Pi(A)y]. \tag{2.9}$$

First let $y \in A(\mathfrak{H})$, so that $P(A)y = o$ and $\Pi(A)y = y = A\hat{x}$. Then (2.9) becomes

$$w_n = w_{n-1} - \tau A w_{n-1} \quad \text{with} \quad w_n = \Pi(A)x_n - \hat{x} \in \mathfrak{M}(A).$$

Applying Lemma 2.2 as in the proof of a) gives $w_n \to o$, $\Pi(A)x_n \to \hat{x}$ and, because of (2.8),

$$x_n = P(A)x_n + \Pi(A)x_n \to P(A)x_0 + \hat{x}.$$

Now let $y \notin A(\mathfrak{H})$. In the case $P(A)y \ne o$ the sequence $(x_n)$ certainly diverges because of (2.8). In the case $P(A)y = o$ the assumption $x_n \to z \in \mathfrak{H}$ leads, by statement a), to the contradiction $A x_n \to Az = \Pi(A)y = y$, that is, $y \in A(\mathfrak{H})$.


For $A \ge O$ the convergence behaviour of (2.7) is thus completely settled. We add the following remarks:

1. Whether (2.7) converges or diverges is independent of the choice of the initial element $x_0 \in \mathfrak{H}$.

2. The sequence $(A x_n) \subset \mathfrak{H}$ always converges. Even if equation (1.1) is not solvable, method (2.7) can be used to find an $x^* \in \mathfrak{H}$ which, for given $\varepsilon \in \mathbb{R}^+$, satisfies the inequality

$$\|A x^* - y\| < J(A, y) + \varepsilon$$

(minimum problem!).

3. The requirement $A \ge O$ cannot be replaced by hermiticity of $A$, since then, even in the case $y \in A(\mathfrak{H})$, a parameter $\tau$ that produces convergence can no longer be found with certainty. This can already be seen from the simple example

$$\mathfrak{H} = E_2 \ \text{(two-dimensional Euclidean space)}, \qquad A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

In Section 3, however, we shall give an iteration method for Hermitian mappings that arises from (2.7) by a simple modification.

4. The statements of Theorem 1 remain valid if one assumes $A \le O$ (with the corresponding definition). One then merely has to consider the equation $-Ax = -y$ in place of (1.1).
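Remark 3 can be checked by looking at the factors by which the error components along the eigenvectors are multiplied. The sketch below is an illustration only; the sampled values of `tau` are arbitrary assumptions. One step of (2.7) multiplies the component belonging to the eigenvalue $\lambda$ by $1 - \tau\lambda$, whereas two consecutive steps with alternating sign of $\tau$ multiply it by $1 - \tau^2\lambda^2$.

```python
def one_step_factor(tau, eigenvalues):
    """Largest |1 - tau*lambda|: error amplification of one step of (2.7)."""
    return max(abs(1 - tau * lam) for lam in eigenvalues)

def double_step_factor(tau, eigenvalues):
    """Largest |1 - tau^2*lambda^2|: amplification of a +tau step followed by a -tau step."""
    return max(abs(1 - (tau * lam) ** 2) for lam in eigenvalues)

eigenvalues = [1.0, -1.0]   # spectrum of the Hermitian example matrix diag(1, -1)

# Arbitrary sample parameters (assumption): none of them makes (2.7) contract.
plain = {tau: one_step_factor(tau, eigenvalues)
         for tau in (-1.5, -0.5, 0.1, 0.5, 1.0, 1.9)}

# With alternating signs, every 0 < tau < sqrt(2) contracts both components.
alternating = {tau: double_step_factor(tau, eigenvalues)
               for tau in (0.5, 1.0, 1.3)}

print(plain)
print(alternating)
```

The alternating variant is the modification that the text announces for Section 3.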


3. Generalization of method (1.2)

A natural generalization of (2.7) is the method

$$x_n = x_{n-1} + T\,(y - A x_{n-1}); \quad x_0 \in \mathfrak{H},$$

where $T$ denotes a suitable mapping of $\mathfrak{H}$ into itself. We consider the case $T = \tau A^*$, already treated by QUADE [9] for finite matrices.


Theorem 2. For the iteration method

$$x_n = x_{n-1} + \tau A^*(y - A x_{n-1}); \quad x_0 \in \mathfrak{H} \tag{3.1}$$

with $0 < \tau < 2\|A\|^{-2}$ and $y \in \mathfrak{H}$ the following statements hold:

a) $A x_n \to \Pi(A^*)y$ and $\|A x_n - y\| \to J(A, y)$;

b) (3.1) converges if and only if the equation

$$Ax = \Pi(A^*)y \tag{3.2}$$

is solvable. In that case $x_n \to P(A)x_0 + \hat{x}$, where $\hat{x}$ is the minimal solution of (3.2).
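A small numerical sketch of method (3.1) follows; in today's literature an iteration of this form is usually called the Landweber iteration. The non-symmetric rank-one matrix, the data `y`, `x0` and the value `tau = 0.4` are hypothetical choices made for illustration. For them $\|A\|^2 = 2$, $\Pi(A^*)y = (2, 2)$, the minimal solution of (3.2) is $(2, 0)$, $P(A)x_0 = (0, 5)$ and $J(A, y) = \sqrt{2}$.

```python
from math import sqrt

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def iterate_3_1(A, y, x0, tau, steps):
    """Run x_n = x_{n-1} + tau * A^T (y - A x_{n-1}), cf. (3.1), for a real matrix A."""
    At = transpose(A)
    x = list(x0)
    for _ in range(steps):
        residual = [yi - ai for yi, ai in zip(y, matvec(A, x))]
        correction = matvec(At, residual)
        x = [xi + tau * ci for xi, ci in zip(x, correction)]
    return x

# Hypothetical data: A is singular and not symmetric, ||A||^2 = 2, so 0 < tau < 1.
A = [[1.0, 0.0], [1.0, 0.0]]
y = [1.0, 3.0]            # not in the range of A, but Pi(A*) y = (2, 2) is
x0 = [0.0, 5.0]
x = iterate_3_1(A, y, x0, tau=0.4, steps=60)

residual_norm = sqrt(sum((yi - ai) ** 2 for yi, ai in zip(y, matvec(A, x))))
print(x, residual_norm)   # x near (2, 5), residual norm near sqrt(2)
```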
Proof. a) Since $A^* P(A^*)y = o$, it follows from (3.1) that

$$A x_n = A x_{n-1} + \tau A A^*\,[\Pi(A^*)y - A x_{n-1}]$$

or

$$v_n = v_{n-1} - \tau A A^* v_{n-1}$$

with $v_n = A x_n - \Pi(A^*)y \in \mathfrak{M}(A^*)$, since $A(\mathfrak{H}) \subset \mathfrak{M}(A^*)$.

Furthermore $AA^* \ge O$ and, by E 3), $AA^*$ is strictly positive on $\mathfrak{M}(AA^*) = \mathfrak{M}(A^*)$. Because $\|AA^*\| \le \|A\|^2$ we have $0 < \tau < 2\|A\|^{-2} \le 2\|AA^*\|^{-1}$. Applying Lemma 2.2 with $\mathfrak{M}(A^*)$ in place of $\mathfrak{H}$ and $AA^*$ in place of $A$, together with Lemma 2.1, yields statement a).

b) If the sequence $(x_n) \subset \mathfrak{H}$ converges to a $z \in \mathfrak{H}$, then by statement a), already proved,

$$A x_n \to Az = \Pi(A^*)y, \quad \text{hence} \quad \Pi(A^*)y \in A(\mathfrak{H}).$$

Now let $\Pi(A^*)y \in A(\mathfrak{H})$, that is, $\Pi(A^*)y = A\hat{x}$; then (3.1) takes the form

$$x_n = x_{n-1} - \tau A^*A\,(x_{n-1} - \hat{x}). \tag{3.3}$$

By E 2), (3.3) splits into the two equations

$$P(A)x_n = P(A)x_{n-1} = P(A)x_0,$$
$$w_n = w_{n-1} - \tau A^*A\,w_{n-1}$$

with $w_n = \Pi(A)x_n - \hat{x} \in \mathfrak{M}(A)$. Since $A^*A$ is strictly positive on $\mathfrak{M}(A)$, a corresponding application of Lemma 2.2 again gives $w_n \to o$, hence

$$x_n = P(A)x_n + \Pi(A)x_n \to P(A)x_0 + \hat{x}.$$

In a finite-dimensional Hilbert space, where the case $\Pi(A^*)y \notin A(\mathfrak{H})$ cannot occur, our statements coincide with those of QUADE. From method (3.1) a simpler method for Hermitian mappings can be derived.




Theorem 3. Let $A$ be Hermitian and $y \in \mathfrak{H}$. Then the following statements hold for the method

$$x_n = x_{n-1} + (-1)^{n-1}\,\tau\,(y - A x_{n-1}); \quad x_0 \in \mathfrak{H} \tag{3.4}$$

with

$$0 < \tau < \sqrt{2}\,\|A\|^{-1} \tag{3.5}$$

a) $A x_n \to \Pi(A)y$, $\|A x_n - y\| \to J(A, y)$;

b) (3.4) converges if and only if equation (1.1) is solvable. In that case $x_n \to P(A)x_0 + \hat{x}$, where $\hat{x}$ is the minimal solution of (1.1).
Proof. From (3.4) it follows that

$$x_{n+2} = x_n + \tau^2 A\,(y - A x_n). \tag{3.6}$$

This is precisely (3.1) with $\tau^2$ in place of $\tau$ for the subsequences $(x_{2\nu})$ and $(x_{2\nu+1})$, $\nu = 0, 1, 2, \ldots$. Since (3.5) entails the inequality $0 < \tau^2 < 2\|A\|^{-2}$, the results of Theorem 2 carry over directly.

a) $A x_{2\nu} \to \Pi(A)y$ and $A x_{2\nu+1} \to \Pi(A)y$, together with Lemma 2.1, yield statement a) of Theorem 3.

b) (3.4) converges if and only if the subsequences $(x_{2\nu})$ and $(x_{2\nu+1})$ converge to the same limit. First let $\Pi(A)y \in A(\mathfrak{H})$ and let $\hat{x}$ be the minimal solution of the equation $Ax = \Pi(A)y$; then by Theorem 2

$$x_{2\nu} \to P(A)x_0 + \hat{x}, \tag{3.7}$$
$$x_{2\nu+1} \to P(A)x_1 + \hat{x} = P(A)x_0 + \hat{x} + \tau P(A)y. \tag{3.8}$$

Hence (3.4) converges if and only if, in addition to $\Pi(A)y \in A(\mathfrak{H})$, also $P(A)y = o$ holds. But this means precisely $y = \Pi(A)y \in A(\mathfrak{H})$.


The minimum problem mentioned after Theorem 1 can also be treated iteratively with methods (3.1) and (3.4).

If one sets $A = E - \kappa K$ in (3.4), one obtains a method treated by BÜCKNER [4] on p. 309 for integral equations of the second kind $\kappa K x + y = x$ with a kernel $K$ that is continuous in the basic domain. The sufficient convergence conditions given there,

$$\left| 1 - \tau^2 \left( 1 - \frac{\kappa}{\kappa_\nu} \right)^2 \right| < 1, \qquad \nu = 1, 2, \ldots,$$

which presuppose knowledge of the reciprocal eigenvalues $\kappa_\nu$ of $K$, can now be replaced by the simpler requirement

$$0 < \tau < \sqrt{2}\,\|E - \kappa K\|^{-1},$$

which is certainly satisfied for

$$0 < \tau < \sqrt{2}\,(1 + |\kappa|\,\|K\|)^{-1}.$$

We then obtain clear information about the convergence behaviour of the method even in the case excluded by BÜCKNER, in which $\kappa$ coincides with one of the $\kappa_\nu$.




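The following overview briefly summarizes the content of the paper once more:

Assumptions | Method | Necessary and sufficient condition for convergence | Further property
$A \ge O$, $y \in \mathfrak{H}$, $0 < \tau < 2\|A\|^{-1}$ | $x_n = x_{n-1} + \tau\,(y - A x_{n-1})$, $x_0 \in \mathfrak{H}$ | $Ax = y$ solvable | $\|A x_n - y\| \to J(A, y)$
$A$ Hermitian, $y \in \mathfrak{H}$, $0 < \tau < \sqrt{2}\,\|A\|^{-1}$ | $x_n = x_{n-1} + (-1)^{n-1}\tau\,(y - A x_{n-1})$, $x_0 \in \mathfrak{H}$ | $Ax = y$ solvable | $\|A x_n - y\| \to J(A, y)$
$y \in \mathfrak{H}$, $0 < \tau < 2\|A\|^{-2}$ | $x_n = x_{n-1} + \tau A^*(y - A x_{n-1})$, $x_0 \in \mathfrak{H}$ | $Ax = \Pi(A^*)y$ solvable | $\|A x_n - y\| \to J(A, y)$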


4. Examples

Since Hermitian mappings occur frequently in practice, we give two corresponding examples for method (3.4). Apart from the choice of the parameter $\tau$, the speed of convergence of this method, as of the other methods mentioned, depends on the spectral distribution of the mapping $A$, as BODEWIG [10] already notes. If we consider the smallest annulus $r \le |z| \le R$ of the complex $z$-plane that contains all nonzero points of the spectrum, the convergence is the worse the more the ratio $r/R$ deviates from 1. In Example 2, therefore, because $r = 0$, good convergence cannot be expected from the outset. Since the methods are very simple to handle, however, they are likely to retain practical importance even in such cases.


Example 1. System of equations

$$\begin{aligned} -3.4\,\xi_1 + 6.4\,\xi_2 - 0.2\,\xi_3 &= 10.0 \\ 6.4\,\xi_1 - 0.4\,\xi_2 + 6.2\,\xi_3 &= -13.0 \\ -0.2\,\xi_1 + 6.2\,\xi_2 + 2.9\,\xi_3 &= 3.5. \end{aligned} \tag{4.1}$$

(4.1) is solvable. Estimating the norm of the coefficient matrix $A$ by the square root of the sum of the squared coefficients gives $\|A\| < 13.4$. With $\tau = 0.1$ and the notation $x = (\xi_1/\xi_2/\xi_3)$, $x_n = (\xi_1^{(n)}/\xi_2^{(n)}/\xi_3^{(n)})$, method (3.4), starting from $x_0 = o = (0/0/0)$, produced the following approximations:

x_1  = ( 1.0000/-1.3000/ 0.3500)
x_2  = (-1.1790/ 0.9090/-0.7245)
x_3  = (-1.1761/ 0.8491/-0.7516)
x_4  = (-1.2178/ 0.8964/-0.7696)
x_5  = (-1.2209/ 0.8888/-0.7765)
x_6  = (-1.2214/ 0.8904/-0.7762)
x_7  = (-1.2221/ 0.8890/-0.7776)
x_8  = (-1.2221/ 0.8892/-0.7775)
x_9  = (-1.2223/ 0.8890/-0.7778)
x_10 = (-1.2222/ 0.8889/-0.7777)
x_11 = (-1.2222/ 0.8888/-0.7777)
x_12 = (-1.2223/ 0.8889/-0.7777)
x_13 = (-1.2223/ 0.8880/-0.7777)
x_14 = x_12.

A better approximation of the minimal solution $\hat{x} = (-\tfrac{11}{9}/+\tfrac{8}{9}/-\tfrac{7}{9})$ is not possible in four-digit arithmetic. It should also be noted that $A$ is a singular matrix that is not even positive semidefinite (with eigenvalues $0$; $9$; $-9.9$). The failure of the familiar classical iteration methods is already apparent from the unfavourable main diagonal.
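The computation of Example 1 can be repeated in a few lines. The sketch below uses the alternating form $x_n = x_{n-1} + (-1)^{n-1}\tau\,(y - A x_{n-1})$ of method (3.4) with the data of (4.1) and $\tau = 0.1$; the function names are illustrative, and ordinary floating-point arithmetic replaces the four-digit hand calculation, so later digits may differ slightly from the table.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def alternating_iteration(A, y, x0, tau, steps):
    """Return [x_1, ..., x_steps] for x_n = x_{n-1} + (-1)^(n-1) * tau * (y - A x_{n-1})."""
    x = list(x0)
    history = []
    for n in range(1, steps + 1):
        sign = 1.0 if n % 2 == 1 else -1.0      # (-1)^(n-1)
        Ax = matvec(A, x)
        x = [xi + sign * tau * (yi - ai) for xi, yi, ai in zip(x, y, Ax)]
        history.append(x)
    return history

# Data of system (4.1); A is symmetric with eigenvalues 0, 9 and -9.9.
A = [[-3.4, 6.4, -0.2],
     [6.4, -0.4, 6.2],
     [-0.2, 6.2, 2.9]]
y = [10.0, -13.0, 3.5]

history = alternating_iteration(A, y, [0.0, 0.0, 0.0], tau=0.1, steps=40)
minimal_solution = [-11 / 9, 8 / 9, -7 / 9]

for n in (1, 2, 14, 40):
    print(n, [round(c, 4) for c in history[n - 1]])
```

Two consecutive steps multiply the error components belonging to the eigenvalues $9$ and $-9.9$ by $1 - 0.81$ and $1 - 0.9801$, which explains the rapid convergence seen in the table.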

If in (4.1) we use, instead of the right-hand side $y = (+10.0/-13.0/+3.5)$, the vector $y + \Delta y = (+10/-10/0)$, we obtain an unsolvable system. Because

$$\|Ax - (y + \Delta y)\| \le \|Ax - y\| + \|\Delta y\|,$$

for every $\varepsilon \in \mathbb{R}^+$ there must be an $x^*$ such that

$$\|A x^* - (y + \Delta y)\| \le \varepsilon + \|\Delta y\| = \varepsilon + 4.6\ldots \tag{4.2}$$

With the initial element $x_0 = o$, the iterative treatment of the minimum problem gives

x_1  = ( 1.0000/-1.0000/ 0.0000)
x_2  = (-0.9800/ 0.6800/-0.6400)
x_3  = (-0.7612/ 0.7312/-0.8956)
x_4  = (-1.0165/ 0.6595/-0.6868)
x_5  = (-0.7979/ 0.7623/-0.9168)
x_6  = (-1.0204/ 0.6527/-0.6941)
x_7  = (-0.7989/ 0.7622/-0.9179)
x_8  = (-1.0211/ 0.6513/-0.6955)
x_9  = (-0.7990/ 0.7620/-0.9180)
x_10 = (-1.0213/ 0.6510/-0.6958)
x_11 = (-0.7991/ 0.7621/-0.9181)
x_12 = (-1.0213/ 0.6510/-0.6959)
x_13 = x_11.

The speed of convergence is the same as in the solvable case. Both $A x_{12}$ and $A x_{13}$ are equally good approximations of the right-hand side $y + \Delta y$, with

$$\|A x_{12} - (y + \Delta y)\| = \|A x_{13} - (y + \Delta y)\| = 3.3\ldots$$

The bound (4.2) is therefore even undercut. According to (3.7) and (3.8), $x_{13} - x_{12}$ must be an approximate solution of the homogeneous equation $Ax = o$. Here we even have exactly $A(x_{13} - x_{12}) = (0/0/0) = o$. The solution of the minimum problem with smallest norm is best approximated by $x_{12}$, as the numerical values show.
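The unsolvable variant can be checked in the same way. The sketch below reruns the alternating method (3.4) with the right-hand side $(10/-10/0)$ and $\tau = 0.1$; the function names are illustrative. The null space of $A$ is spanned by $(2, 1, -2)$, so $\|P(A)(y + \Delta y)\| = 10/3$, and by (3.7), (3.8) the difference between an odd and the preceding even iterate should approach $\tau P(A)(y + \Delta y) = (2/9, 1/9, -2/9)$.

```python
from math import sqrt

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def alternating_iteration(A, y, x0, tau, steps):
    """Return [x_1, ..., x_steps] for x_n = x_{n-1} + (-1)^(n-1) * tau * (y - A x_{n-1})."""
    x = list(x0)
    history = []
    for n in range(1, steps + 1):
        sign = 1.0 if n % 2 == 1 else -1.0
        Ax = matvec(A, x)
        x = [xi + sign * tau * (yi - ai) for xi, yi, ai in zip(x, y, Ax)]
        history.append(x)
    return history

A = [[-3.4, 6.4, -0.2],
     [6.4, -0.4, 6.2],
     [-0.2, 6.2, 2.9]]
y_perturbed = [10.0, -10.0, 0.0]          # y + Delta y, not in the range of A

history = alternating_iteration(A, y_perturbed, [0.0, 0.0, 0.0], tau=0.1, steps=41)
x_even, x_odd = history[39], history[40]  # x_40 and x_41

residual_norm = sqrt(sum((yi - ai) ** 2
                         for yi, ai in zip(y_perturbed, matvec(A, x_even))))
difference = [a - b for a, b in zip(x_odd, x_even)]

print(round(residual_norm, 4))            # about 3.3333
print([round(d, 4) for d in difference])  # about (0.2222, 0.1111, -0.2222)
```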


Example 2. Integral equation of the first kind

$$\int_0^1 |s - t|\,x(t)\,dt = s^4 + s^2 - 3s + 2. \tag{4.3}$$

The mapping $A$ of the Hilbert space $L_2[0, 1]$ into itself defined by the kernel $L(s, t) = |s - t|$ is Hermitian but not positive, for it has one positive and infinitely many negative eigenvalues, as COLLATZ [11] shows. Because

$$\|A\|^2 \le \int_0^1\!\!\int_0^1 |s - t|^2 \, ds \, dt = \tfrac{1}{6},$$

(3.4) can be used with $\tau = 2$. Starting from $x_0 = o$, the approximate solutions are polynomials, whose computation is very simple because

$$\int_0^1 |s - t|\,t^n\,dt = \frac{1}{n+2} - \frac{s}{n+1} + \frac{2 s^{n+2}}{(n+1)(n+2)}.$$

The graphical representation of some approximations conveys an impression of the convergence of the method. The exact solution $x = 1 + 6s^2$ is also plotted.
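Because every iterate is a polynomial, the first steps of Example 2 can be reproduced exactly with rational arithmetic. The sketch below is an illustration under stated assumptions: polynomials are stored as coefficient lists (index = power of $s$), the operator is applied term by term via the integral formula for $\int_0^1 |s-t|\,t^n\,dt$, the right-hand side is taken as $A$ applied to the exact solution $1 + 6s^2$, and the alternating form of (3.4) with $\tau = 2$ is used. All helper names are hypothetical.

```python
from fractions import Fraction as F
from math import sqrt

def apply_A(p):
    """(A p)(s) = integral_0^1 |s - t| p(t) dt for a polynomial p given by its coefficients."""
    out = [F(0)] * (len(p) + 2)
    for n, c in enumerate(p):
        out[0] += c * F(1, n + 2)
        out[1] -= c * F(1, n + 1)
        out[n + 2] += c * F(2, (n + 1) * (n + 2))
    return out

def axpy(a, p, q):
    """Return a*p + q for coefficient lists of possibly different length."""
    size = max(len(p), len(q))
    p = list(p) + [F(0)] * (size - len(p))
    q = list(q) + [F(0)] * (size - len(q))
    return [a * pi + qi for pi, qi in zip(p, q)]

def l2_norm(p):
    """L2[0,1] norm of a polynomial, using integral_0^1 s^(i+j) ds = 1/(i+j+1)."""
    total = sum(pi * pj * F(1, i + j + 1)
                for i, pi in enumerate(p) for j, pj in enumerate(p))
    return sqrt(total)

exact = [F(1), F(0), F(6)]          # exact solution 1 + 6 s^2
y = apply_A(exact)                  # right-hand side of (4.3)
tau = F(2)

x = [F(0)]
errors = [l2_norm(axpy(F(-1), x, exact))]
for n in range(1, 5):
    sign = 1 if n % 2 == 1 else -1  # (-1)^(n-1)
    residual = axpy(F(-1), apply_A(x), y)
    x = axpy(sign * tau, residual, x)
    errors.append(l2_norm(axpy(F(-1), x, exact)))

print([round(e, 3) for e in errors])
```

The first two error norms agree with the values 3.493 and 2.600 listed for $x_0$ and $x_1$; the degree of the iterates grows by two with every step.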


Fig. 1. Some approximations $x_n$ and the exact solution $x = 1 + 6s^2$ for $0 \le s \le 1$.


Since the eigenvalues accumulate at the origin, the speed of convergence is not very high. Against this stands the fact that an iterative treatment of equation (4.3) with the iteration methods known so far is not possible. The errors of the plotted approximations are

$\|x - x_0\| = 3.493$
$\|x - x_1\| = 2.600$
$\|x - x_{10}\| = 0.727$
$\|x - x_{20}\| = 0.495$
$\|x - x_{30}\| = 0.459$
$\|x - x_{40}\| = 0.150$.


References

[1] FRIDMAN, V. M.: Eine Methode der sukzessiven Approximation für Fredholmsche Integralgleichungen erster Art. Uspehi Mat. Nauk (N.S.) 11, Nr. 1 (67), 233–234 (1956) [in Russian].

[2] WIARDA, G.: Integralgleichungen unter besonderer Berücksichtigung der Anwendungen. Leipzig u. Berlin: B. G. Teubner 1930.

[3] BÜCKNER, H.: A special method of successive approximations for Fredholm integral equations. Duke math. J. 15, 197–206 (1948).

[4] BÜCKNER, H.: Ein unbeschränkt anwendbares Iterationsverfahren für Fredholmsche Integralgleichungen. Math. Nachr., Berlin 2, 304–313 (1949).

[5] MISES, R. v., u. H. POLLACZEK-GEIRINGER: Praktische Verfahren der Gleichungsauflösung. Z. angew. Math. Mech. 9, 58–77 (1929).

[6] RIESZ, F., u. B. SZ.-NAGY: Vorlesungen über Funktionalanalysis. Berlin: VEB Deutscher Verlag der Wissenschaften 1956.

[7] LJUSTERNIK, L. A., u. W. J. SOBOLEW: Elemente der Funktionalanalysis. Berlin: Akademie-Verlag 1955.

[8] ZAANEN, A. C.: Linear Analysis. Amsterdam-Groningen: North-Holland Publishing Co. 1953.

[9] QUADE, W.: Auflösung linearer Gleichungen durch Matrizeniteration. Bericht Mathemat.-Tagung Tübingen 1946, 123–124.

[10] BODEWIG, E.: Bericht über die verschiedenen Methoden zur Lösung eines Systems linearer Gleichungen mit reellen Koeffizienten, III. Proc. Akad. Wet. Amsterdam 50, 1285–1295 (1947).

[11] COLLATZ, L.: Schrittweise Näherungen bei Integralgleichungen und Eigenwertschranken. Math. Z., Berlin 46, 692–708 (1940).


Dresden A 20
Brunnenstraße 11


(Received July 17, 1959)


Application of Fixed-Point Theorems to the
Numerical Treatment of Nonlinear Equations
in Partially Ordered Spaces


JOHANN SCHRÖDER


Communicated by L. COLLATZ


Bounds are derived for solutions $u^*$ of linear or nonlinear equations of the form

$$Au = Bu \tag{0.1}$$

in a real partially ordered linear space $\mathfrak{R}$. $A$ is assumed to be of monotone kind in the sense of the definition of L. COLLATZ [2]. $Bu$ is to be enclosed between two functions $H_1$ and $H_2$ with certain monotonicity properties. The problem (0.1) is assumed to be reducible, in a specified way, to the form

$$u = Tu. \tag{0.2}$$

It is often convenient to start from an equation of the more general kind (0.1) rather than directly from (0.2). On the other hand, (0.2) is contained in (0.1) as a special case.

Most fixed-point theorems (FPTs) deal with equations of the form (0.2) and state assumptions on the operator $T$ and on subsets $\mathfrak{M}$ of its domain under which the following statement holds: if $T$ maps a set $\mathfrak{M}$ into itself, then there exists a fixed point $u^* = Tu^* \in \mathfrak{M}$. Of this kind are, for example, the FPT for contracting mappings in the form given by J. WEISSINGER [12], Brouwer's FPT [1], [5], the FPTs proved by J. SCHAUDER [6] for completely continuous and weakly continuous mappings in a Banach space, and the FPT of TYCHONOFF [11].

The FPT for contracting mappings has already been used many times in the numerical solution of equations (0.2), especially in the form given by L. COLLATZ [3]. The other FPTs, however, have so far been used mainly to prove the existence of solutions. Yet these FPTs, too, state not only that a solution exists but also that $u^* \in T\mathfrak{M} \subset \mathfrak{M}$.

Here we construct sets $\mathfrak{M}$ with $T\mathfrak{M} \subset \mathfrak{M}$ of such a kind that the statement $u^* \in T\mathfrak{M} \subset \mathfrak{M}$ constitutes a numerically useful estimate of the solution. For this purpose we use sets $\mathfrak{M} = \langle x, y \rangle$ consisting of all $u \in \mathfrak{R}$ with $x \le u \le y$ for given $x$, $y$. The operator $T$ is required to be such that $T\langle x, y \rangle \subset \langle x, y \rangle$ implies the existence of a fixed point $u^* \in \langle x, y \rangle$ and hence of a solution of equation (0.1). To check in a concrete case whether $T$ has this property, one will draw on the FPTs proved in the literature.

Arch. Rational Mech. Anal., Vol. 4

Theorem 1 in No. 2.1 gives general conditions on the elements $x$, $y$ under which $T\langle x, y \rangle \subset \langle x, y \rangle$.

In No. 2.2 an iteration method for two sequences $\{x_n\}$ and $\{y_n\}$ is investigated for the special case (0.2) of equation (0.1). Theorem 2 states that under certain conditions these sequences enclose a solution. This result contains as a special case the enclosure statements proved in [4] for the iteration method $u_{n+1} = Tu_n$ with a monotonically increasing or decreasing operator $T$. Theorem 2, however, is applicable to far more general cases. Instead of requiring the operator $T$ itself to be monotone, it suffices here to assume that $T$ is monotonically decomposable, i.e. representable as the difference of two monotonically increasing operators. A matrix operator $T$, for example, is monotone only in special cases, but it can always be decomposed monotonically (see Example 1 in Nos. 3.1, 3.2).

In No. 2.3 the iteration method $u_{n+1} = Tu_n$ is treated. Starting from bounds for $u_1 - u_0$, Theorem 3 gives estimates of the error $u^* - u_1$. If Theorem 3 is applied to the differential equations investigated in [7], one arrives at results similar to those obtained there. In the case of a simple boundary value problem treated as an example in No. 3.3, however, it turns out that the estimates obtained here are more accurate and need weaker assumptions. The results proved in [7], on the other hand, have the advantage of also containing convergence statements.

Theorem 4 in No. 2.4 provides estimates of the error $u^* - u_0$ of an approximation $u_0$, computed in any way whatever, to a solution $u^*$ of the general problem (0.1). These estimates start from bounds for the defect $-Au_0 + Bu_0$. The results for differential equations proved in [9] stand in a similar relation to those obtained here as the results derived in [7] do to the statements of Theorem 3.

The remarks on applications in Section 3 serve essentially only to illustrate the abstract theorems. Numerical results are to be published in a subsequent paper. Simple numerical examples can, however, also be found in [8] and [10].

[8] contains similar, more detailed investigations for the special case of linear systems of equations. In [10] fixed-point theorems are applied in a somewhat simpler way to the solution of boundary value problems.


1. Notation and statement of the problem
1.1. Notation
Let $\mathfrak{R}$ be a real partially ordered linear space of elements $u, v, \ldots$. Thus addition $u + v$ and multiplication $\alpha u$ by real numbers $\alpha$ are defined in $\mathfrak{R}$, and for certain pairs of elements $u$, $v$ the relation $u \le v$ is defined. Furthermore, let a notion of limit $\lim u_n = u$ be defined in $\mathfrak{R}$, i.e. let it be specified when a sequence $\{u_n\}$ $(n = 1, 2, \ldots)$ is to be called convergent (to an element $u$). The usual rules of calculation are assumed to hold.


Numerische Lésung nichtlinearer Gleichungen 179 


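For example, $\mathfrak{R}$ may be a Banach space with norm $\|u\|$ in which, in addition, an order relation $u \le v$ is defined. Then $\lim u_n = u$ means $\lim \|u - u_n\| = 0$. Among the usual rules of calculation is the requirement that $\lim u_n \ge 0$ (= zero element) whenever $\{u_n\}$ is a convergent sequence with $u_n \ge 0$.

If $x$, $y$ are two elements of $\mathfrak{R}$ with $x \le y$ and $\mathfrak{N}$ denotes a subset of $\mathfrak{R}$, let $\langle x, y \rangle_{\mathfrak{N}}$ be the set of all $u \in \mathfrak{N}$ with $x \le u \le y$:

$$\langle x, y \rangle_{\mathfrak{N}} = \{u \in \mathfrak{N} : x \le u \le y\}.$$

Furthermore we define

$$\langle -\infty, y \rangle_{\mathfrak{N}} = \{u \in \mathfrak{N} : u \le y\}, \quad \langle x, \infty \rangle_{\mathfrak{N}} = \{u \in \mathfrak{N} : x \le u\}, \quad \langle -\infty, \infty \rangle_{\mathfrak{N}} = \mathfrak{N}. \tag{1.1}$$

In the case $\mathfrak{N} = \mathfrak{R}$ we omit the index:

$$\langle x, y \rangle_{\mathfrak{R}} = \langle x, y \rangle.$$

Let $\mathfrak{S}$ be a space of the same kind as $\mathfrak{R}$.

An operator $M$ that maps a subset $\mathfrak{M}$ of $\mathfrak{R}$ into $\mathfrak{S}$ is called:
positive, if $Mu \ge 0$ for $u \ge 0$ $(u \in \mathfrak{M})$,
monotonically increasing, if $Mu \le Mv$ for $u \le v$ $(u, v \in \mathfrak{M})$,
monotonically decreasing, if $Mu \ge Mv$ for $u \le v$ $(u, v \in \mathfrak{M})$,
monotonically decomposable, if $M$ can be represented as a difference

$$M = P - Q$$

of two monotonically increasing operators $P$ and $Q$.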


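1.2. Statement of the problem
Let the equation

$$Au = Bu \tag{1.2}$$

be given, with (not necessarily linear) operators $A$ and $B$. Let $A$ map a linear manifold $\mathfrak{A} \subset \mathfrak{R}$ and $B$ a set $\langle \varphi, \psi \rangle$ into $\mathfrak{S}$ ($\varphi = -\infty$, $\psi = \infty$ being admitted):

$$A\mathfrak{A} \subset \mathfrak{S}, \qquad B\langle \varphi, \psi \rangle \subset \mathfrak{S}.$$

Let the operator $A$ be of monotone kind in the sense of the definition of L. COLLATZ [2], i.e.

$$Au \le Av \ \text{implies} \ u \le v \quad \text{for } u, v \in \mathfrak{A}. \tag{1.3}$$

Let there be two functions $H_1[\xi, \eta]$ and $H_2[\xi, \eta]$ with the following properties:

The functions $H_i[\xi, \eta]$ are defined for elements $\xi$, $\eta$ lying in a set $\langle \Phi, \Psi \rangle$ and have values in $\mathfrak{S}$ ($\Phi = -\infty$, $\Psi = \infty$ being admitted).

The following relations hold:

$$H_1[u, u] \le Bu \le H_2[u, u] \quad \text{for } u \in \langle \varphi, \psi \rangle \cap \langle \Phi, \Psi \rangle, \tag{1.4}$$
$$H_i[\xi_1, \eta_1] \le H_i[\xi_2, \eta_2] \quad \text{for } \xi_1 \le \xi_2,\ \eta_1 \ge \eta_2 \quad (\xi_j, \eta_j \in \langle \Phi, \Psi \rangle;\ i = 1, 2). \tag{1.5}$$

In particular one may have $H_1 = H_2 = H$, that is,

$$Bu = H[u, u].$$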


1.3. Transforming the problem into $u = Tu$

We want to bring the given equation (1.2) into the form $u = Tu$. In the case $\mathfrak{A} = \mathfrak{R} = \mathfrak{S}$, $A = I$ (identity operator), (1.2) is already of this kind, and one can set $T = B$. Many problems will be brought into this special form before the functional-analytic results derived here are applied to them. Often, however, it is also convenient to start from the more general equation (1.2) and then to define $T$ "implicitly" as described below. For this purpose we further assume the following.

Let there be a linear manifold $\mathfrak{C}$ with $\mathfrak{A} \subset \mathfrak{C} \subset \mathfrak{R}$ such that for every $u \in \langle \varphi, \psi \rangle_{\mathfrak{C}}$ there exists a $v \in \mathfrak{A}$ satisfying the equation

$$Av = Bu.$$

Because of (1.3) this $v$ is uniquely determined. By

$$v = Su$$

an operator $S$ is then defined that maps $\langle \varphi, \psi \rangle_{\mathfrak{C}}$ into $\mathfrak{A}$.

In the case $\mathfrak{C} = \mathfrak{R}$ we set $T = S$. This is possible, for example, if $\mathfrak{A} = \mathfrak{R} = \mathfrak{S}$ and $A = I$. The above assumption, however, is not generally satisfied for $\mathfrak{C} = \mathfrak{R}$, for instance in many cases not for boundary value problems for partial differential equations (see [10]).

In the case $\mathfrak{C} \ne \mathfrak{R}$ it is assumed that $S$ can be extended to an operator $T$ that maps $\langle \varphi, \psi \rangle$ into $\mathfrak{C}$. Then the given problem is equivalent to that of finding a solution $u^* \in \langle \varphi, \psi \rangle$ of the equation

$$u = Tu.$$

For if $Au = Bu$, then $u \in \langle \varphi, \psi \rangle \cap \mathfrak{A} = \langle \varphi, \psi \rangle_{\mathfrak{A}} \subset \langle \varphi, \psi \rangle_{\mathfrak{C}}$ and hence $u = Su = Tu$. If, conversely, $u = Tu$, then $u \in \langle \varphi, \psi \rangle \cap \mathfrak{C} = \langle \varphi, \psi \rangle_{\mathfrak{C}}$, hence $u = Su$, i.e. $Au = ASu = Bu$.

It is further assumed that this extension of $S$ to $T$ is carried out so that the following holds. If $u \in \langle \varphi, \psi \rangle$ but $u \notin \mathfrak{C}$, and $x$ is any element of $\langle \varphi, \psi \rangle$ with $x \le u$, then there exists a sequence $\{u_n\} \subset \langle \varphi, \psi \rangle_{\mathfrak{C}}$ with $x \le u_n \le u$ and $\lim_{n \to \infty} Su_n = Tu$.


1.4. The required fixed-point theorem
Of the operator $T$ defined in No. 1.3 we now further assume that a fixed-point theorem of the following kind holds for it:
If $T$ maps a set $\langle x, y \rangle \subset \langle \varphi, \psi \rangle$ with $x \le y$, $x \ne -\infty$, $y \ne +\infty$ into itself, i.e. if

$$T\langle x, y \rangle \subset \langle x, y \rangle,$$

then there exists a solution

$$u^* = Tu^* \in \langle x, y \rangle.$$

To determine in a concrete case whether $T$ has this property, one will check whether $T$ satisfies the assumptions of one of the known fixed-point theorems. J. SCHAUDER [6], for example, proved the following fixed-point theorem: if an operator $T$ maps a convex, closed subset $\mathfrak{M}$ of a Banach space continuously into itself and if $T\mathfrak{M}$ is compact, then $\mathfrak{M}$ contains a fixed point $u^* = Tu^*$.

The set $\mathfrak{M} = \langle x, y \rangle$ is convex and closed. Hence, if $\mathfrak{R}$ is a partially ordered Banach space and one can show that (for arbitrary $x, y \in \langle \varphi, \psi \rangle$ with $x \le y$) $T$ is continuous on $\langle x, y \rangle$ and $T\langle x, y \rangle$ is compact, then Schauder's fixed-point theorem guarantees that $T$ has the property required here in No. 1.4. Here a set is called compact if each of its infinite subsets has an accumulation element (lying in $\mathfrak{R}$).


2. Existence statements and error estimates
2.1. General enclosure statements
We first prove a very general enclosure theorem to which the statements of the following sections are reduced.

Theorem 1. Let there be two elements $x, y \in \langle \varphi, \psi \rangle_{\mathfrak{A}} \cap \langle \Phi, \Psi \rangle$ with

$$x \le y, \tag{2.1}$$
$$Ax \le H_1[x, y], \qquad H_2[y, x] \le Ay. \tag{2.2}$$

Then the given problem $Au = Bu$ has a solution $u^*$ for which

$$x \le u^* \le y \tag{2.3}$$

and

$$Ax \le H_1[x, y] \le Bu^* \le H_2[y, x] \le Ay \tag{2.4}$$

hold.

Proof. For $u \in \langle x, y \rangle$, i.e. $x \le u \le y$, (1.4) and (1.5) give

$$H_1[x, y] \le H_1[u, u] \le Bu \le H_2[u, u] \le H_2[y, x] \tag{2.5}$$

and hence, by (2.2),

$$Ax \le Bu \le Ay. \tag{2.6}$$

In the case $u \in \mathfrak{C}$ one has $Bu = ASu$. By the monotonicity property (1.3) of $A$, (2.6) therefore yields $x \le Su \le y$, i.e., because $Su = Tu$,

$$x \le Tu \le y. \tag{2.7}$$

If $u$ does not belong to $\mathfrak{C}$, then by assumption (cf. No. 1.3) there exists a sequence $\{u_n\} \subset \langle \varphi, \psi \rangle_{\mathfrak{C}}$ with $x \le u_n \le u$ and $\lim Su_n = Tu$. For each $u_n$ we have, by the above, $x \le Su_n \le y$. Passing to the limit $n \to \infty$ one again obtains (2.7) for this $u$ as well.

Thus $T$ maps $\langle x, y \rangle$ into itself. By the assumption stated in No. 1.4 there therefore exists a solution $u^* \in \langle x, y \rangle$, for which, by (2.5), (2.4) also holds.


2.2. Enclosing the solution by approximations of an iteration method

In this and the following subsection we start from equations of the form

$$u = Tu, \tag{2.8}$$

that is, we treat the special case $\mathfrak{A} = \mathfrak{R} = \mathfrak{S}$, $A = I$, $B = T$ of problem (1.2). By the assumptions stated in No. 1.2, $Tu = Bu$ can be estimated in the form

$$H_1[u, u] \le Tu \le H_2[u, u], \tag{2.9}$$

where (1.5) holds; in particular one may have

$$Tu = H[u, u]. \tag{2.10}$$

We investigate the iteration method

$$x_{n+1} = H_1[x_n, y_n], \qquad y_{n+1} = H_2[y_n, x_n] \qquad (n = 0, 1, 2, \ldots). \tag{2.11}$$

Theorem 2. Let there be two elements $x_0, y_0 \in \langle \varphi, \psi \rangle \cap \langle \Phi, \Psi \rangle$ for which

$$x_0 \le y_0, \qquad x_0 \le x_1, \qquad y_1 \le y_0 \tag{2.12}$$

holds. Then the iteration (2.11) can be carried out indefinitely, and there exists a solution $u^* = Tu^*$ with

$$x_0 \le x_1 \le x_2 \le \cdots \le u^* \le \cdots \le y_2 \le y_1 \le y_0. \tag{2.13}$$

Proof. With $x = x_0$, $y = y_0$, the assumptions (2.1), (2.2) of Theorem 1 (for $\mathfrak{A} = \mathfrak{R} = \mathfrak{S}$, $A = I$) are satisfied because of (2.12). By Theorem 1 there therefore exists a solution $u^*$ for which (2.4) holds, that is, because $Bu^* = Tu^* = u^*$,

$$x_0 \le x_1 \le u^* \le y_1 \le y_0.$$

(2.13) follows from this by complete induction. For if

$$x_{p-1} \le x_p \le u^* \le y_p \le y_{p-1}$$

for some $p \ge 1$, then

$$x_p = H_1[x_{p-1}, y_{p-1}] \le H_1[x_p, y_p] = x_{p+1},$$
$$x_{p+1} = H_1[x_p, y_p] \le H_1[u^*, u^*] \le Tu^* = u^*,$$
$$u^* = Tu^* \le H_2[u^*, u^*] \le H_2[y_p, x_p] = y_{p+1},$$
$$y_{p+1} = H_2[y_p, x_p] \le H_2[y_{p-1}, x_{p-1}] = y_p,$$

i.e.

$$x_p \le x_{p+1} \le u^* \le y_{p+1} \le y_p.$$


If $T$ is monotonically decomposable, i.e.

$$T = P - Q \quad \text{with monotonically increasing operators } P \text{ and } Q,$$

then $Tu$ has the form (2.10) with

$$H[\xi, \eta] = P\xi - Q\eta.$$

If, in particular, $Qu$ does not depend on $u$, so that $Qu$ is a fixed element of $\mathfrak{R}$, then $T$ is monotonically increasing; if $Pu$ does not depend on $u$, then $T$ is monotonically decreasing. In these cases the statements of Theorem 2 are identical with those formulated in [4]. For a monotonically decreasing operator, [4] merely used a slightly different notation; for instance, the sequence $\{x_n\}$ was there called $x_0, y_1, x_2, y_3, \ldots$.
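The two-sided iteration (2.11) for a monotonically decomposable operator can be illustrated with a small affine map $Tu = Mu + r$ on $\mathbb{R}^2$ with componentwise order. The sketch is an illustration under assumptions: the matrix `M`, the vector `r` and the starting bounds are hypothetical, and the decomposition chosen is $P\xi = M^{+}\xi + r$, $Q\eta = M^{-}\eta$, where $M^{+}$ and $M^{-}$ are the positive and negative parts of $M$. Exact rational arithmetic keeps the order comparisons free of rounding effects.

```python
from fractions import Fraction as F

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def leq(u, v):
    """Componentwise partial order u <= v."""
    return all(a <= b for a, b in zip(u, v))

# Hypothetical data: T u = M u + r; M has entries of both signs, so T is not monotone.
M = [[F(1, 5), F(-3, 10)], [F(-1, 10), F(2, 5)]]
r = [F(1), F(1)]
M_plus = [[max(a, F(0)) for a in row] for row in M]
M_minus = [[max(-a, F(0)) for a in row] for row in M]

def H(xi, eta):
    """H[xi, eta] = P xi - Q eta: increasing in xi, decreasing in eta."""
    p = matvec(M_plus, xi)
    q = matvec(M_minus, eta)
    return [pi - qi + ri for pi, qi, ri in zip(p, q, r)]

lower = [[F(0), F(0)]]          # x_0
upper = [[F(3), F(3)]]          # y_0
for _ in range(30):
    x, y = lower[-1], upper[-1]
    lower.append(H(x, y))       # x_{n+1} = H[x_n, y_n]
    upper.append(H(y, x))       # y_{n+1} = H[y_n, x_n]

u_star = [F(2, 3), F(14, 9)]    # exact fixed point of T, solved by hand
print([float(c) for c in lower[-1]], [float(c) for c in upper[-1]])
```

For this data the widths $y_n - x_n$ are multiplied by the nonnegative matrix $M^{+} + M^{-}$ in every step, so the enclosure tightens geometrically.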

(2.11) can also be interpreted as an iteration method with a monotonically increasing operator. Let $\tilde{\mathfrak{R}}$ be the space of pairs of elements $(u, v)$ $(u \in \mathfrak{R}, v \in \mathfrak{R})$ with the following definition of order:

$$(u, v) \le (\xi, \eta) \quad \text{means} \quad u \le \xi,\ v \ge \eta. \tag{2.14}$$

Let $\tilde{T}$ be the operator defined by

$$\tilde{T}(u, v) = (H_1[u, v],\ H_2[v, u]).$$

This operator is monotonically increasing in the sense of definition (2.14). (2.11) can then be written in the form

$$(x_{n+1}, y_{n+1}) = \tilde{T}(x_n, y_n) \qquad (n = 0, 1, 2, \ldots),$$

and (2.13) is equivalent to

$$(x_0, y_0) \le (x_1, y_1) \le (x_2, y_2) \le \cdots \le (u^*, u^*).$$
2.3. Error estimates for the iteration method $u_{n+1} = Tu_n$
As in No. 2.2 we start from an equation

$$u = Tu,$$

that is, we assume $\mathfrak{A} = \mathfrak{R} = \mathfrak{S}$, $A = I$, $B = T$. From Theorem 1 we now derive error estimates for the iteration method

$$u_{n+1} = Tu_n \qquad (n = 0, 1, 2, \ldots), \tag{2.15}$$

namely estimates of the error $u^* - u_{n+1}$ that start from bounds for the last change $u_{n+1} - u_n$. Since the two elements most recently computed by (2.15) can always be regarded as the zeroth and first approximation, we restrict ourselves to estimates containing only these approximations $u_0$ and $u_1$.

If $u_0 \in \langle \varphi, \psi \rangle \cap \langle \Phi, \Psi \rangle$, we define¹

$$K_i[\xi, \eta] = H_i[u_0 + \xi, u_0 + \eta] - Bu_0 \qquad (i = 1, 2), \tag{2.16}$$

where in this Section 2.3 we thus have $Bu_0 = Tu_0 = u_1$.

Theorem 3. Let $u_0$ be an element of $\langle \varphi, \psi \rangle \cap \langle \Phi, \Psi \rangle$ and $u_1 = Tu_0$, and let there be two elements $v, w \in \mathfrak{R}$ with the following properties²:

$$v \le w, \tag{2.17}$$
$$\Phi - v \le u_0 \le \Psi - w, \tag{2.18}$$
$$\varphi - K_1[v, w] \le u_1 \le \psi - K_2[w, v], \tag{2.19}$$
$$v - K_1[v, w] \le u_1 - u_0 \le w - K_2[w, v]. \tag{2.20}$$

Then there exists a solution $u^* = Tu^*$ for which

$$K_1[v, w] \le u^* - u_1 \le K_2[w, v] \tag{2.21}$$

holds.

Proof. We set

$$x = H_1[u_0 + v, u_0 + w], \qquad y = H_2[u_0 + w, u_0 + v]. \tag{2.22}$$

These quantities are defined because of (2.17) and (2.18). (2.19) is then equivalent to

$$\varphi \le x, \qquad y \le \psi,$$

and (2.20) becomes

$$u_0 + v \le x, \qquad y \le u_0 + w. \tag{2.23}$$

Because of (1.5), (2.17), (2.18) and (2.23) one has

$$\Phi \le x \le y \le \Psi.$$

From (2.23) it further follows with (1.5) that

$$x = H_1[u_0 + v, u_0 + w] \le H_1[x, y],$$
$$y = H_2[u_0 + w, u_0 + v] \ge H_2[y, x].$$

Thus all assumptions of Theorem 1 are satisfied for the elements (2.22). Consequently there exists a solution $u^* = Tu^*$ for which (2.3) holds. This relation is identical with (2.21).

¹ Usually one will have $Bu_0 = H_1[u_0, u_0] = H_2[u_0, u_0]$. This holds, for example, for the functions $H_i$ derived in No. 3.3 in the case $\xi = \eta$.

² If one of the quantities $\varphi$, $\Phi$ or $\psi$, $\Psi$ that occur equals $-\infty$ or $\infty$, respectively, in the sense of definition (1.1), the corresponding inequality is to be omitted in what follows. For example, in the case $\Phi = -\infty$, $\Psi \ne \infty$, (2.18) reads: $u_0 \le \Psi - w$.


Remarks. We discuss the choice of the elements $\varphi$, $\psi$, $\Phi$ and $\Psi$. Condition (2.18) ensures that the arguments of the functions $H_1$ and $H_2$ occurring in the proof lie in $\langle \Phi, \Psi \rangle$. Requirement (2.19) has the consequence that the elements $x$, $y$ defined by (2.22) belong to $\langle \varphi, \psi \rangle$.

For the inequalities (2.18) and (2.19) to be satisfied, it is therefore advisable to choose $\varphi$ and $\Phi$ as "small" as possible and $\psi$ and $\Psi$ as "large" as possible. If one even uses $\varphi = \Phi = -\infty$, $\psi = \Psi = \infty$, requirements (2.18), (2.19) drop out altogether².

If, however, one tries to enclose $Tu$ according to (2.9) between functions $H_1$ and $H_2$ of very simple structure, these functions (for example the parameters possibly occurring in them) will in many cases depend on the domain $\langle \varphi, \psi \rangle$ of the operator $T$ and will then turn out the more favourable the larger $\varphi$ and the smaller $\psi$ is chosen. In this case assumption (2.19) becomes significant. Even then, however, one can almost always use $\Phi = -\infty$, $\Psi = \infty$, so that (2.18) drops out.

Requirement (2.19) corresponds to the assumption, formulated by L. COLLATZ [3] for the fixed-point theorem for contracting mappings, that a certain ball with centre $u_1$ be contained in the domain of $T$.


2.4. Estimates based on the defect


We now consider the general case of problem (1.2). The results of Section 2.3 can be used for it by applying them to the operator $T$ constructed in No. 1.3. For the error estimate one then has to determine two successive approximations $u_0$ and $u_1 = Tu_0$, that is, elements $u_0$ and $u_1$ linked by the equation $Au_1 = Bu_0$, provided $u_0$ lies in the domain of $T$. In many cases such a pair of elements can be computed with little or at least tolerable effort (see e.g. [3], [7]); in others this can be very difficult or even impossible. From Theorem 1 we therefore now derive error estimates which require only one approximation $u_0 \in \mathfrak{A}$ and which start from bounds for the defect $-Au_0 + Bu_0$ instead of the change $u_1 - u_0$. Let $K_1$ and $K_2$ be defined by (2.16).




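Theorem 4. Let $u_0$ be an element of $\langle\varphi, \psi\rangle \cap \langle\Phi, \Psi\rangle$ and
$$d[u_0] = -Au_0 + Bu_0 \tag{2.24}$$
the defect of the equation $Au = Bu$ with respect to $u_0$. Suppose there exist two elements $v$, $w$ with the following properties²:
$$v \le w, \tag{2.25}$$
$$\Phi - v \le u_0 \le \Psi - w, \tag{2.26}$$
$$\varphi - v \le u_0 \le \psi - w, \tag{2.27}$$
$$A(u_0 + v) - Au_0 - K_1[v, w] \le d[u_0] \le A(u_0 + w) - Au_0 - K_2[w, v]. \tag{2.28}$$
Then the given problem (1.2) has a solution $u^*$ with
$$v \le u^* - u_0 \le w.$$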
Proof. We set
$$x = u_0 + v, \qquad y = u_0 + w.$$
By (2.25), (2.26) and (2.27) we have $x, y \in \langle\varphi, \psi\rangle \cap \langle\Phi, \Psi\rangle$, so that, among other things, $K_1[v, w]$ and $K_2[w, v]$ are defined. (2.25) means $x \le y$. From (2.28) it follows with (1.5) that
$$Ax = A(u_0 + v) \le K_1[v, w] + Bu_0 = H_1[u_0 + v,\, u_0 + w] = H_1[x, y],$$
$$Ay = A(u_0 + w) \ge K_2[w, v] + Bu_0 = H_2[u_0 + w,\, u_0 + v] = H_2[y, x]. \tag{2.29}$$
The elements $x$, $y$ defined above therefore satisfy the hypotheses of Theorem 1, and the assertion follows from Theorem 1.


In general the operator $A$ will be linear. Then (2.28) becomes
$$Av - K_1[v, w] \le d[u_0] \le Aw - K_2[w, v].$$


Hypotheses (2.26) and (2.27) have a meaning similar to that of hypotheses (2.18) and (2.19) in Theorem 3 (see the remarks there).


3. On the application of the results

In this section we show by a few simple examples what the abstract quantities occurring in Sections 1 and 2 (in particular 2.2 to 2.4) can look like in a concrete case. The section is meant only to help in understanding the preceding exposition and to give an impression of the kind of problems to which the abstract results can be applied. A detailed account of the applications with numerical examples is to follow later.


3.1. The examples


Example 1. A system of linear equations
$$u = Mu + s \tag{3.1}$$
with a given $p \times p$ matrix $M = (m_{ik})$ and a given $p$-dimensional vector $s = (s_i)$ is to be solved.




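Example 2. We seek a function $\alpha(t)$, continuous on the interval $[0,1]$, which satisfies the integral equation
$$\alpha(t) = \int_0^1 G(t,s)\, f(s, \alpha(s))\, ds. \tag{3.2}$$
Here
$$G(t,s) = \begin{cases} t(1-s) & \text{for } 0 \le t \le s \le 1, \\ s(1-t) & \text{for } 0 \le s \le t \le 1, \end{cases} \tag{3.3}$$
and $f(t,z)$ is a continuous function defined for $0 \le t \le 1$, $-\infty < z < \infty$. (3.2) is equivalent to the boundary value problem
$$-\frac{d^2\alpha}{dt^2} = f(t, \alpha), \qquad \alpha(0) = \alpha(1) = 0. \tag{3.4}$$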
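As a quick numerical illustration of the equivalence between the integral equation (3.2) and the boundary value problem (3.4), the following Python sketch evaluates the integral operator with the Green's function of $-d^2/dt^2$ under zero boundary values, $G(t,s) = s(1-t)$ for $s \le t$ and $G(t,s) = t(1-s)$ for $t \le s$. The names `green` and `integral_operator`, the trapezoidal rule and the test right-hand side $f \equiv 1$ are assumptions made for this illustration; none of them comes from the paper.

```python
def green(t, s):
    """Green's function of -u'' = g on [0, 1] with u(0) = u(1) = 0."""
    return s * (1.0 - t) if s <= t else t * (1.0 - s)


def integral_operator(f, alpha, t, n=1000):
    """Trapezoidal approximation of the integral of G(t,s) f(s, alpha(s)) over [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for k in range(n + 1):
        s = k * h
        weight = 0.5 if k in (0, n) else 1.0  # end points count half
        total += weight * green(t, s) * f(s, alpha(s))
    return total * h


# Hypothetical test case: f == 1, so the integral does not depend on alpha.
value = integral_operator(lambda s, z: 1.0, lambda s: 0.0, 0.3)
```

For $f \equiv 1$ the problem $-\alpha'' = 1$, $\alpha(0) = \alpha(1) = 0$ has the solution $\alpha(t) = t(1-t)/2$, so `value` should be close to $0.105$.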
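Example 3. We seek a pair $\alpha(t)$, $\beta(t)$ of functions continuous on $[0,1]$ which satisfy the system
$$\alpha(t) = \int_0^1 G(t,s)\, f(s, \alpha(s), \beta(s))\, ds, \qquad \beta(t) = \int_0^1 \frac{\partial G}{\partial t}(t,s)\, f(s, \alpha(s), \beta(s))\, ds. \tag{3.5}$$
Here $G(t,s)$ is defined by (3.3), and $f(t, z_1, z_2)$ is defined and continuous for $0 \le t \le 1$, $-\infty < z_i < \infty$. This system is equivalent to the boundary value problem
$$-\frac{d^2\alpha}{dt^2} = f\Bigl(t, \alpha, \frac{d\alpha}{dt}\Bigr), \qquad \alpha(0) = \alpha(1) = 0. \tag{3.6}$$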
Example 4. We seek a function $\alpha(t)$, twice continuously differentiable on the interval $[0,1]$, which satisfies the boundary value problem
$$-\frac{d^2\alpha}{dt^2} = f_1(t, \alpha), \qquad \alpha(0) = f_2(\alpha(0)), \qquad \alpha(1) = 0. \tag{3.7}$$
Let $f_1(t,z)$ be defined and continuous for $0 \le t \le 1$, $-\infty < z < \infty$, and $f_2(z)$ for $-\infty < z < \infty$.
3.2. On the application of Theorem 2

We show that, under certain assumptions, each of Problems 1 to 3 can be written as an equation (2.8)
$$u = Tu \tag{3.8}$$
with a monotonically decomposable operator $T$ which maps a partially ordered Banach space into itself. We thus use $\langle\varphi, \psi\rangle = \langle -\infty, \infty\rangle = \mathfrak{R}$.

Example 1. Let $\mathfrak{R}$ be the Banach space of $p$-dimensional vectors $u = (u^i)$ with the norm $\|u\| = \max_i |u^i|$, and let $u \le v$ be defined by $u^i \le v^i$ $(i = 1, 2, \dots, p)$. (3.1) has the form (3.8) with
$$Tu = Mu + s.$$
$T$ can be represented as
$$T = P - Q,$$
where $P$ and $Q$ are defined by
$$Pu = M^+ u + s, \qquad Qu = M^- u$$
with the matrices
$$M^+ = \Bigl(\frac{|m_{ik}| + m_{ik}}{2}\Bigr), \qquad M^- = \Bigl(\frac{|m_{ik}| - m_{ik}}{2}\Bigr).$$
$P$ and $Q$ are monotonically increasing.
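The splitting of $M$ into its entrywise positive and negative parts can be sketched in a few lines of Python. This is only an illustration under the assumption $M^+ = ((|m_{ik}| + m_{ik})/2)$, $M^- = ((|m_{ik}| - m_{ik})/2)$; the helper names `split_matrix`, `matvec` and `apply_T` and the sample data `M`, `s` are hypothetical.

```python
def split_matrix(M):
    """Return (M_plus, M_minus), both entrywise >= 0, with M = M_plus - M_minus."""
    M_plus = [[(abs(m) + m) / 2 for m in row] for row in M]
    M_minus = [[(abs(m) - m) / 2 for m in row] for row in M]
    return M_plus, M_minus


def matvec(M, u):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, u)) for row in M]


def apply_T(M, s, u):
    """Evaluate Tu = Pu - Qu with Pu = M_plus u + s and Qu = M_minus u."""
    M_plus, M_minus = split_matrix(M)
    Pu = [a + b for a, b in zip(matvec(M_plus, u), s)]
    Qu = matvec(M_minus, u)
    return [p - q for p, q in zip(Pu, Qu)]


# Hypothetical sample data.
M = [[0.2, -0.3], [-0.1, 0.4]]
s = [1.0, 2.0]
M_plus, M_minus = split_matrix(M)
Tu = apply_T(M, s, [1.0, 1.0])
```

Because all entries of `M_plus` and `M_minus` are non-negative, $u \le v$ componentwise implies $Pu \le Pv$ and $Qu \le Qv$, which is the monotonicity used in the text. `Tu` agrees with $Mu + s$ computed directly.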


Example 2. Let $\mathfrak{R}$ be the Banach space of functions $u(t)$ continuous on $[0,1]$ with the norm $\|u\| = \max |u(t)|$, and let $u \le v$ be defined by $u(t) \le v(t)$ for $0 \le t \le 1$.

Problem (3.2) has the form (3.8) with $u(t) = \alpha(t)$ and
$$Tu = \int_0^1 G(t,s)\, f(s, u(s))\, ds.$$

We additionally assume that $f(t,z)$ can be represented as the difference³
$$f(t,z) = p(t,z) - q(t,z) \tag{3.9}$$
of two continuous functions $p(t,z)$ and $q(t,z)$ which increase monotonically in $z$:
$$p(t,z) \le p(t,\bar z), \qquad q(t,z) \le q(t,\bar z) \qquad \text{for } z \le \bar z. \tag{3.10}$$
Then
$$T = P - Q$$
with the operators $P$ and $Q$ defined by
$$Pu = \int_0^1 G(t,s)\, p(s, u(s))\, ds, \qquad Qu = \int_0^1 G(t,s)\, q(s, u(s))\, ds, \tag{3.11}$$
which increase monotonically because of $G(t,s) \ge 0$ and (3.10).


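If $\frac{\partial f}{\partial z}(t,z)$ exists and this derivative is continuous, then for an arbitrary continuous function $\vartheta(t)$ one obtains, for example,
$$f(t,z) = f(t,\vartheta) + \int_\vartheta^z \frac{\partial f}{\partial z}(t,\zeta)\, d\zeta = p(t,z) - q(t,z)$$
with
$$p(t,z) = f(t,\vartheta) + \int_\vartheta^z \Bigl[\frac{\partial f}{\partial z}(t,\zeta)\Bigr]^+ d\zeta, \qquad q(t,z) = \int_\vartheta^z \Bigl[\frac{\partial f}{\partial z}(t,\zeta)\Bigr]^- d\zeta. \tag{3.12}$$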
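A small Python sketch may make the construction of a monotone splitting from the derivative concrete. It assumes one admissible choice, $p(z) = f(\vartheta) + \int_\vartheta^z [f'(\zeta)]^+ d\zeta$ and $q(z) = \int_\vartheta^z [f'(\zeta)]^- d\zeta$, drops the dependence on $t$ for brevity, and uses a midpoint rule. The names `monotone_split` and `integrate`, the placement of the constant $f(\vartheta)$ in $p$, and the test function $f = \sin$ are assumptions for this illustration.

```python
import math


def pos(x):
    """Positive part: x if x >= 0, else 0."""
    return max(x, 0.0)


def neg(x):
    """Negative part: -x if x <= 0, else 0."""
    return max(-x, 0.0)


def integrate(g, a, b, n=2000):
    """Midpoint rule for the integral of g from a to b (sign flips if b < a)."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


def monotone_split(f, dfdz, theta):
    """Return nondecreasing functions p, q with f = p - q (up to quadrature error)."""
    def p(z):
        return f(theta) + integrate(lambda x: pos(dfdz(x)), theta, z)

    def q(z):
        return integrate(lambda x: neg(dfdz(x)), theta, z)

    return p, q


# Hypothetical example: f(z) = sin z, f'(z) = cos z, theta = 0.
p, q = monotone_split(math.sin, math.cos, theta=0.0)
```

Both `p` and `q` are nondecreasing because their integrands are non-negative, and `p(z) - q(z)` reproduces `math.sin(z)` up to the quadrature error.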


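Here the following definition is used. If $g(t_i)$ is a real-valued function of several variables $t_i$, then
$$g^+(t_i) = \begin{cases} g(t_i) & \text{for } g(t_i) \ge 0, \\ 0 & \text{for } g(t_i) \le 0, \end{cases} \qquad g^-(t_i) = \begin{cases} 0 & \text{for } g(t_i) \ge 0, \\ -g(t_i) & \text{for } g(t_i) \le 0. \end{cases} \tag{3.13}$$
With this we have
$$g(t_i) = g^+(t_i) - g^-(t_i), \qquad |g(t_i)| = g^+(t_i) + g^-(t_i). \tag{3.14}$$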


³ Every function of bounded variation can be represented as a difference of monotone functions. However, the monotone functions used for this purpose in the theory are not well suited to applications. Here $p$ and $q$ should increase in $z$ as weakly as possible.




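Example 3. Let $\mathfrak{R}$ be the Banach space of vectors $u(t) = \begin{pmatrix} \alpha(t) \\ \beta(t) \end{pmatrix}$ whose coordinates $\alpha$, $\beta$ are continuous functions on $[0,1]$, with
$$\|u\| = \max\bigl(\max |\alpha(t)|,\ \max |\beta(t)|\bigr).$$
$u \le \bar u$ means $\alpha(t) \le \bar\alpha(t)$, $\beta(t) \le \bar\beta(t)$ for $0 \le t \le 1$. Problem (3.5) has the form (3.8) with
$$Tu = \begin{pmatrix} \int_0^1 G(t,s)\, f(s, \alpha(s), \beta(s))\, ds \\ \int_0^1 \frac{\partial G}{\partial t}(t,s)\, f(s, \alpha(s), \beta(s))\, ds \end{pmatrix}.$$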


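For simplicity we additionally assume that
$$f(t, z_1, z_2) \le f(t, \bar z_1, \bar z_2) \qquad \text{for } z_1 \le \bar z_1,\ z_2 \le \bar z_2.$$
Then
$$T = P - Q$$
with
$$Pu = \begin{pmatrix} \int_0^1 G(t,s)\, f(s, \alpha(s), \beta(s))\, ds \\ \int_0^1 \bigl[\frac{\partial G}{\partial t}(t,s)\bigr]^+ f(s, \alpha(s), \beta(s))\, ds \end{pmatrix}, \qquad Qu = \begin{pmatrix} 0 \\ \int_0^1 \bigl[\frac{\partial G}{\partial t}(t,s)\bigr]^- f(s, \alpha(s), \beta(s))\, ds \end{pmatrix},$$
and both operators $P$ and $Q$ are monotonically increasing.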


3.3. On the application of Theorem 3

We treat here only Example 2, which was already formulated in functional-analytic terms in No. 3.2.

a) First we derive a function $H[\xi, \eta]$, defined for $\xi, \eta \in \langle\Phi, \Psi\rangle = \langle -\infty, \infty\rangle = \mathfrak{R}$, with which (2.10) holds. We again assume that $f(t,z)$ can be represented in the form (3.9). One could then use
$$H[\xi, \eta] = P\xi - Q\eta \tag{3.15}$$
with the operators $P$ and $Q$ defined by (3.11). Here, however, we give a function $H$ which is more favourable for the estimates according to Theorem 3. We then also show what some of the quantities occurring in Theorem 3 look like with this function.

In what follows $\vartheta(t)$ denotes a fixed continuous function. In general it is advisable to take for $\vartheta(t)$ an approximation to the solution sought. The results become simplest with $\vartheta(t) = u_0(t)$.


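From (3.9) it follows for an arbitrary continuous function $u(t)$ that
$$\begin{aligned}
f(t,u) &= p(t, \vartheta + (u - \vartheta)) - q(t, \vartheta + (u - \vartheta)) \\
&= f(t,\vartheta) + \bigl[p(t, \vartheta + (u-\vartheta)^+) - p(t,\vartheta)\bigr] + \bigl[q(t,\vartheta) - q(t, \vartheta - (u-\vartheta)^-)\bigr] \\
&\qquad - \bigl[q(t, \vartheta + (u-\vartheta)^+) - q(t,\vartheta)\bigr] - \bigl[p(t,\vartheta) - p(t, \vartheta - (u-\vartheta)^-)\bigr] \\
&= f(t,\vartheta) + \max\bigl\{p(t, \vartheta + (u-\vartheta)^+) - p(t,\vartheta),\ q(t,\vartheta) - q(t, \vartheta - (u-\vartheta)^-)\bigr\} \\
&\qquad - \max\bigl\{q(t, \vartheta + (u-\vartheta)^+) - q(t,\vartheta),\ p(t,\vartheta) - p(t, \vartheta - (u-\vartheta)^-)\bigr\}.
\end{aligned}$$
Hence
$$Tu = H[u, u]$$
with
$$H[\xi, \eta](t) = \int_0^1 G(t,s)\, h[\xi, \eta](s)\, ds \tag{3.16}$$
and
$$\begin{aligned}
h[\xi, \eta](t) = f(t,\vartheta) &+ \max\bigl\{p(t, \vartheta + (\xi-\vartheta)^+) - p(t,\vartheta),\ q(t,\vartheta) - q(t, \vartheta - (\eta-\vartheta)^-)\bigr\} \\
&- \max\bigl\{q(t, \vartheta + (\eta-\vartheta)^+) - q(t,\vartheta),\ p(t,\vartheta) - p(t, \vartheta - (\xi-\vartheta)^-)\bigr\}.
\end{aligned} \tag{3.17}$$
Here the notation (3.13) is used.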


This function $H[\xi, \eta]$ has the monotonicity property (1.5), for $G(t,s) \ge 0$ and the relation $u \le v$ is identical with $(u - \vartheta)^+ \le (v - \vartheta)^+$, $(u - \vartheta)^- \ge (v - \vartheta)^-$. If the maxima occurring in (3.17) are replaced by the sum of the terms whose maximum is to be taken, one obtains the function $H$ in (3.15). Correspondingly, in the quantities $K$ and $k$ derived below from $H$ the maxima would then also have to be replaced by sums. Formulas (3.20), for example, show that the results would thereby become considerably worse.
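The pointwise function $h[\xi, \eta]$ can be tried out numerically. The sketch below builds $h$ from a monotone splitting $f = p - q$ and a reference function $\vartheta$, treating `xi` and `eta` as the values $\xi(t)$ and $\eta(t)$ at a fixed $t$. The name `make_h` and the sample choice $p(t,z) = z^3$, $q(t,z) = z$, $\vartheta \equiv 0.5$ are assumptions made for this illustration.

```python
def pos(x):
    """Positive part in the sense of (3.13)."""
    return max(x, 0.0)


def neg(x):
    """Negative part in the sense of (3.13)."""
    return max(-x, 0.0)


def make_h(p, q, theta):
    """Build h(t, xi, eta) from nondecreasing p, q with f = p - q."""
    def h(t, xi, eta):
        th = theta(t)
        # Terms that can only raise the value: increasing in xi, decreasing in eta.
        gain = max(p(t, th + pos(xi - th)) - p(t, th),
                   q(t, th) - q(t, th - neg(eta - th)))
        # Terms that can only lower the value: increasing in eta, decreasing in xi.
        loss = max(q(t, th + pos(eta - th)) - q(t, th),
                   p(t, th) - p(t, th - neg(xi - th)))
        return p(t, th) - q(t, th) + gain - loss

    return h


# Hypothetical example: f(t, z) = z**3 - z with p = z**3, q = z, theta = 0.5.
h = make_h(lambda t, z: z ** 3, lambda t, z: z, lambda t: 0.5)
```

With equal arguments the function reproduces $f$, for instance `h(0.0, 2.0, 2.0)` equals $2^3 - 2 = 6$, and it is nondecreasing in `xi` and nonincreasing in `eta`, which is the monotonicity property discussed above.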


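If $\frac{\partial f}{\partial z}(t,z)$ is continuous and one uses (3.12), one obtains
$$\begin{aligned}
h[\xi, \eta](t) = f(t,\vartheta) &+ \max\Bigl\{\int_0^{(\xi-\vartheta)^+} \Bigl[\frac{\partial f}{\partial z}(t, \vartheta + \sigma)\Bigr]^+ d\sigma,\ \int_0^{(\eta-\vartheta)^-} \Bigl[\frac{\partial f}{\partial z}(t, \vartheta - \sigma)\Bigr]^- d\sigma\Bigr\} \\
&- \max\Bigl\{\int_0^{(\eta-\vartheta)^+} \Bigl[\frac{\partial f}{\partial z}(t, \vartheta + \sigma)\Bigr]^- d\sigma,\ \int_0^{(\xi-\vartheta)^-} \Bigl[\frac{\partial f}{\partial z}(t, \vartheta - \sigma)\Bigr]^+ d\sigma\Bigr\}.
\end{aligned}$$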


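For $\vartheta = u_0$ the functions $K_1$ and $K_2$ defined in (2.16) then take the form
$$K_i[\xi, \eta](t) = K[\xi, \eta](t) = \int_0^1 G(t,s)\, k[\xi, \eta](s)\, ds \qquad (i = 1, 2) \tag{3.18}$$
with
$$\begin{aligned}
k[\xi, \eta] = &\max\Bigl\{\int_0^{\xi^+} \Bigl[\frac{\partial f}{\partial z}(t, u_0 + \sigma)\Bigr]^+ d\sigma,\ \int_0^{\eta^-} \Bigl[\frac{\partial f}{\partial z}(t, u_0 - \sigma)\Bigr]^- d\sigma\Bigr\} \\
&- \max\Bigl\{\int_0^{\eta^+} \Bigl[\frac{\partial f}{\partial z}(t, u_0 + \sigma)\Bigr]^- d\sigma,\ \int_0^{\xi^-} \Bigl[\frac{\partial f}{\partial z}(t, u_0 - \sigma)\Bigr]^+ d\sigma\Bigr\}.
\end{aligned} \tag{3.19}$$

If one uses $w = -v \ge 0$ in Theorem 3, one needs there the quantities $K[-w, w]$ and $K[w, -w]$. The corresponding $k$ according to (3.18) follow from (3.19) as
$$\begin{aligned}
k[-w, w] &= -\max\Bigl\{\int_0^{w} \Bigl[\frac{\partial f}{\partial z}(t, u_0 + \sigma)\Bigr]^- d\sigma,\ \int_0^{w} \Bigl[\frac{\partial f}{\partial z}(t, u_0 - \sigma)\Bigr]^+ d\sigma\Bigr\}, \\
k[w, -w] &= \max\Bigl\{\int_0^{w} \Bigl[\frac{\partial f}{\partial z}(t, u_0 + \sigma)\Bigr]^+ d\sigma,\ \int_0^{w} \Bigl[\frac{\partial f}{\partial z}(t, u_0 - \sigma)\Bigr]^- d\sigma\Bigr\}.
\end{aligned} \tag{3.20}$$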




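b) Next we derive two different functions $H_1$ and $H_2$ of simpler form with which (2.9) holds. For this we assume that $\frac{\partial f}{\partial z}$ exists and is continuous, and that there are continuous functions $f_{ik}(t,z)$ $(i, k = 1, 2)$, defined for $0 \le t \le 1$, $0 \le z < \infty$, with continuous, non-negative partial derivative with respect to $z$ $\bigl(\frac{\partial f_{ik}}{\partial z}(t,z) \ge 0\bigr)$, such that
$$f_{ik}(t, 0) = 0 \qquad (i, k = 1, 2), \tag{3.21}$$
$$\begin{aligned}
\Bigl[\frac{\partial f}{\partial z}(t, \vartheta + \sigma)\Bigr]^- &\le \frac{\partial f_{11}}{\partial z}(t, \sigma), &\qquad \Bigl[\frac{\partial f}{\partial z}(t, \vartheta - \sigma)\Bigr]^+ &\le \frac{\partial f_{12}}{\partial z}(t, \sigma), \\
\Bigl[\frac{\partial f}{\partial z}(t, \vartheta + \sigma)\Bigr]^+ &\le \frac{\partial f_{21}}{\partial z}(t, \sigma), &\qquad \Bigl[\frac{\partial f}{\partial z}(t, \vartheta - \sigma)\Bigr]^- &\le \frac{\partial f_{22}}{\partial z}(t, \sigma)
\end{aligned} \qquad \text{for } \sigma \ge 0. \tag{3.22}$$

Then one can estimate:
$$h[\xi, \eta](t) \ge f(t, \vartheta) - \max\bigl\{f_{11}(t, (\eta - \vartheta)^+),\ f_{12}(t, (\xi - \vartheta)^-)\bigr\} = h_1[\xi, \eta](t),$$
$$h[\xi, \eta](t) \le f(t, \vartheta) + \max\bigl\{f_{21}(t, (\xi - \vartheta)^+),\ f_{22}(t, (\eta - \vartheta)^-)\bigr\} = h_2[\xi, \eta](t).$$
For the functions $H_1$ and $H_2$ formed according to (3.16) with these functions $h_1$ and $h_2$ in place of $h$, (2.9) holds. These $H_i[\xi, \eta]$ are likewise defined for $\xi, \eta \in \langle\Phi, \Psi\rangle = \langle -\infty, \infty\rangle = \mathfrak{R}$.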

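The expressions simplify further if $f_{11} = f_{12}$ and $f_{21} = f_{22}$, or if all four functions $f_{ik}$ are even equal to a single function $\tilde f(t, z)$. In the latter case, because of (3.14), one can replace (3.21) and the inequalities (3.22) by the equivalent requirements
$$\tilde f(t, 0) = 0, \qquad \Bigl|\frac{\partial f}{\partial z}(t, \vartheta + \sigma)\Bigr| \le \frac{\partial \tilde f}{\partial z}(t, |\sigma|) \qquad \text{for } -\infty < \sigma < \infty, \tag{3.23}$$
and one then obtains
$$h_1[\xi, \eta] = f(t, \vartheta) - \tilde f\bigl(t, \max\{(\eta - \vartheta)^+, (\xi - \vartheta)^-\}\bigr),$$
$$h_2[\xi, \eta] = f(t, \vartheta) + \tilde f\bigl(t, \max\{(\xi - \vartheta)^+, (\eta - \vartheta)^-\}\bigr).$$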


For the corresponding quantities $K_1[-w, w]$ and $K_2[w, -w]$ this yields, in the case $\vartheta = u_0$, the particularly simple expressions
$$-K_1[-w, w] = \int_0^1 G(t,s)\, \tilde f(s, w(s))\, ds = K_2[w, -w].$$


The result of Theorem 3 formulated for $-v = w \ge 0$ with these quantities $K_1[-w, w]$ and $K_2[w, -w]$ (and obtained through an estimate of the kind (3.23)) corresponds to statements proved in [7] in connection with iteration procedures $u_{n+1} = Tu_n$. There, however, the additional hypothesis was needed that $\frac{\partial \tilde f}{\partial z}(t, z)$ increases monotonically in $z$.

c) Finally we derive two simple functions $H_1$ and $H_2$ for which the derivative $\frac{\partial f}{\partial z}$ is not needed. The results obtained with them from Theorem 3 are often even more accurate [if $\frac{\partial f}{\partial z}$ exists] than those obtained with the functions $H_1$ and $H_2$ above.


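Suppose there is a continuous function $\tilde f(t, z)$, defined for $0 \le t \le 1$, $0 \le z < \infty$ and monotonically increasing in $z$, with the following properties:
$$\tilde f(t, 0) = 0, \qquad |f(t, u_0 + \sigma) - f(t, u_0)| \le \tilde f(t, |\sigma|) \qquad \text{for } -\infty < \sigma < \infty.$$

Then one can estimate:
$$\begin{aligned}
f(t, u) &= f(t, u_0) + \bigl[f(t, u_0 + (u - u_0)) - f(t, u_0)\bigr] \\
&\le f(t, u_0) + \tilde f(t, |u - u_0|) \\
&= f(t, u_0) + \tilde f\bigl(t, (u - u_0)^+ + (u - u_0)^-\bigr) \\
&= f(t, u_0) + \tilde f\bigl(t, \max\{(u - u_0)^+, (u - u_0)^-\}\bigr)
\end{aligned}$$
and correspondingly
$$f(t, u) \ge f(t, u_0) - \tilde f\bigl(t, \max\{(u - u_0)^+, (u - u_0)^-\}\bigr).$$
(2.9) therefore holds with
$$H_i[\xi, \eta](t) = \int_0^1 G(t,s)\, h_i[\xi, \eta](s)\, ds \qquad (i = 1, 2)$$
and
$$h_1[\xi, \eta] = f(t, u_0) - \tilde f\bigl(t, \max\{(\eta - u_0)^+, (\xi - u_0)^-\}\bigr),$$
$$h_2[\xi, \eta] = f(t, u_0) + \tilde f\bigl(t, \max\{(\xi - u_0)^+, (\eta - u_0)^-\}\bigr).$$

With these one obtains for $w \ge 0$ the quantities
$$-K_1[-w, w] = \int_0^1 G(t,s)\, \tilde f(s, w(s))\, ds = K_2[w, -w].$$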
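To get a feeling for the size of the quantity $\int_0^1 G(t,s)\, \tilde f(s, w(s))\, ds$, the following Python sketch evaluates it for a simple hypothetical case. It assumes that $f$ is Lipschitz continuous in $z$ with constant $L$, so that $\tilde f(t, z) = Lz$ is an admissible choice (continuous, increasing in $z$, zero at $z = 0$), and that $w$ is a constant $c \ge 0$. The names `green` and `K2`, the trapezoidal rule and the numbers `L = 2.0`, `c = 0.1` are assumptions for this illustration.

```python
def green(t, s):
    """Green's function of -u'' = g on [0, 1] with u(0) = u(1) = 0."""
    return s * (1.0 - t) if s <= t else t * (1.0 - s)


def K2(f_tilde, w, t, n=1000):
    """Trapezoidal approximation of the integral of G(t,s) f_tilde(s, w(s)) over [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for k in range(n + 1):
        s = k * h
        weight = 0.5 if k in (0, n) else 1.0  # end points count half
        total += weight * green(t, s) * f_tilde(s, w(s))
    return total * h


# Hypothetical data: Lipschitz constant L and constant bound w(s) = c.
L = 2.0
c = 0.1
bound = K2(lambda s, z: L * z, lambda s: c, 0.5)
```

For this choice the integral can be evaluated in closed form, $L\,c\,t(1-t)/2$, which gives $0.025$ at $t = 0.5$; `bound` should agree with that value up to rounding.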


The result of Theorem 3 formulated for $-v = w$ with these quantities $K_1$ and $K_2$ can also be derived from [7], but again only under an additional hypothesis, namely that the difference $\tilde f(t, z + z') - \tilde f(t, z)$ increases monotonically with $z$ and $z'$ for $z, z' \ge 0$.


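3.4. On the application of Theorem 4


We show here only how the system of equations (3.7) can be written as a problem of the kind (1.2).

Let $\mathfrak{R}$ be the Banach space, already used in Example 2, of functions continuous on $[0,1]$, and $\mathfrak{S}$ the set of vectors $v = \begin{pmatrix} u(t) \\ \beta \\ \gamma \end{pmatrix}$ with $u(t) \in \mathfrak{R}$ and real numbers $\beta$ and $\gamma$. In $\mathfrak{S}$ we define:
$$v \le \bar v \text{ means } u(t) \le \bar u(t) \text{ for } 0 \le t \le 1,\quad \beta \le \bar\beta,\quad \gamma = \bar\gamma, \tag{3.24}$$
so that the equality sign stands for the $\gamma$.

Further, let $\mathfrak{A}$ be the set of functions $u \in \mathfrak{R}$ with continuous second derivative, $\langle\varphi, \psi\rangle = \mathfrak{R}$ and
$$Au = \begin{pmatrix} -\frac{d^2 u}{dt^2} \\ u(0) \\ u(1) \end{pmatrix}, \qquad Bu = \begin{pmatrix} f_1(t, u) \\ f_2(u(0)) \\ 0 \end{pmatrix}.$$

Then the problem $Au = Bu$ is equivalent to the given problem (3.7), with $\alpha = u$.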
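The defect $d[u_0] = -Au_0 + Bu_0$ of an approximate solution can be evaluated componentwise. The Python sketch below assumes the componentwise reading of (3.7), $Au = (-u'', u(0), u(1))$ and $Bu = (f_1(t,u), f_2(u(0)), 0)$, and takes the second derivative of $u_0$ as an explicitly supplied function. The name `defect`, the sample nonlinearities `f1`, `f2` and the trial function $u_0(t) = t(1-t)/2$ are hypothetical and serve only as an illustration.

```python
def defect(u0, d2u0, f1, f2, ts):
    """Return the three components of d[u0] = -A u0 + B u0.

    interior: u0''(t) + f1(t, u0(t)) on the sample points ts,
    left:     f2(u0(0)) - u0(0),
    right:    -u0(1).
    """
    interior = [d2u0(t) + f1(t, u0(t)) for t in ts]
    left = f2(u0(0.0)) - u0(0.0)
    right = -u0(1.0)
    return interior, left, right


# Hypothetical data: f1(t, z) = 1 + 0.1 z**2, f2(z) = 0.5 z, u0(t) = t (1 - t) / 2.
def u0(t):
    return t * (1.0 - t) / 2.0


interior, left, right = defect(
    u0,
    lambda t: -1.0,                      # exact second derivative of u0
    lambda t, z: 1.0 + 0.1 * z ** 2,     # f1
    lambda z: 0.5 * z,                   # f2
    [0.0, 0.25, 0.5, 0.75, 1.0],
)
```

Here the boundary components of the defect vanish and the interior component equals $0.1\,u_0(t)^2$, which is at most $0.0015625$ (attained at $t = 0.5$). Bounds of this kind on the defect are the input required by Theorem 4.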



$A$ is of monotone kind, for $Au \ge 0$ implies $u \ge 0$; that is,
$$-\frac{d^2 u}{dt^2} \ge 0 \text{ for } 0 \le t \le 1,\quad u(0) \ge 0,\quad u(1) = 0 \quad \text{implies} \quad u(t) \ge 0 \text{ for } 0 \le t \le 1. \tag{3.25}$$

The expression $Bu$ can be estimated in a way similar to $Tu$ in No. 3.3. The operator $T$ belonging to this problem has the form
$$(A^{-1}Bu =)\ Tu = f_2(u(0))\,(1 - t) + \int_0^1 G(t,s)\, f_1(s, u(s))\, ds$$
with the function (3.3).

In Definition (3.24) one can also set $\gamma \le \bar\gamma$, since (3.25) also holds with $u(1) \ge 0$ in place of $u(1) = 0$. This example was meant to show, however, that a "monotone kind" of such weaker form can also be used.


References


[1] Brouwer, L. E. J.: Über die Abbildung von Mannigfaltigkeiten. Math. Ann. 71, 97–115 (1912).

[2] Collatz, L.: Aufgaben monotoner Art. Arch. Math. 3, 366–376 (1952).

[3] Collatz, L.: Numerische Behandlung von Differentialgleichungen, 2. Aufl. Berlin-Göttingen-Heidelberg: Springer 1955.

[4] Collatz, L., u. J. Schröder: Einschließen der Lösungen von Randwertaufgaben. Numer. Math. 1, 61–72 (1959).

[5] Knaster, B., C. Kuratowski u. S. Mazurkiewicz: Ein Beweis des Fixpunktsatzes für n-dimensionale Simplexe. Fund. Math. 14, 132–137 (1929).

[6] Schauder, J.: Der Fixpunktsatz in Funktionalräumen. Studia Math. 2, 171–182 (1930).

[7] Schröder, J.: Fehlerabschätzungen bei gewöhnlichen und partiellen Differentialgleichungen. Arch. Rational Mech. Anal. 2, 376–392 (1958).

[8] Schröder, J.: Fehlerabschätzungen bei linearen Gleichungssystemen mit dem Brouwerschen Fixpunktsatz. Arch. Rational Mech. Anal. 3, 28–44 (1959).

[9] Schröder, J.: Vom Defekt ausgehende Fehlerabschätzungen bei Differentialgleichungen. Arch. Rational Mech. Anal. 3, 219–228 (1959).

[10] Schröder, J.: Error Estimates for Boundary Value Problems using Fixed-point Theorems. Proc. Symp. Madison Wisc. 1959.

[11] Tychonoff, A.: Ein Fixpunktsatz. Math. Ann. 111, 767–776 (1935).

[12] Weissinger, J.: Zur Theorie und Anwendung des Iterationsverfahrens. Math. Nachr. 8, 193–212 (1952).



Institut für angewandte Mathematik
Universität Hamburg


(Received 27 July 1959)


EDITORIAL BOARD


R. BERKER
Technical University
Istanbul

L. CESARI
Research Institute for Advanced Study
Baltimore, Maryland

L. COLLATZ
Institut für Angewandte Mathematik
Universität Hamburg

A. ERDÉLYI
California Institute of Technology
Pasadena, California

J. L. ERICKSEN
The Johns Hopkins University
Baltimore, Maryland

G. FICHERA
Mathematics Research Center
U.S. Army
University of Wisconsin
Madison, Wisconsin

R. FINN
Stanford University
California

HILDA GEIRINGER
Harvard University
Cambridge, Massachusetts

H. GÖRTLER
Institut für Angewandte Mathematik
Universität Freiburg i. Br.

D. GRAFFI
Istituto Matematico „Salvatore Pincherle“
Università di Bologna

A. E. GREEN
King's College
Newcastle-upon-Tyne

J. HADAMARD
Institut de France
Paris

L. HÖRMANDER
Department of Mathematics
University of Stockholm

M. KAC
Cornell University
Ithaca, New York

E. LEIMANIS
University of British Columbia
Vancouver

A. LICHNEROWICZ
Collège de France
Paris

C. C. LIN
Massachusetts Institute of Technology
Cambridge, Massachusetts

W. MAGNUS
Institute of Mathematical Sciences
New York University
New York City

G. C. McVITTIE
University of Illinois Observatory
Urbana, Illinois

J. MEIXNER
Institut für Theoretische Physik
Technische Hochschule Aachen

C. MIRANDA
Istituto di Matematica
Università di Napoli

C. B. MORREY
University of California
Berkeley, California

C. MÜLLER
Mathematisches Institut
Technische Hochschule Aachen

W. NOLL
Carnegie Institute of Technology
Pittsburgh, Pennsylvania

A. OSTROWSKI
Mathematics Research Center
U.S. Army
University of Wisconsin
Madison, Wisconsin

R. S. RIVLIN
Division of Applied Mathematics
Brown University
Providence, Rhode Island

M. M. SCHIFFER
Stanford University
California

J. SERRIN
Institute of Technology
University of Minnesota
Minneapolis, Minnesota

E. STERNBERG
Division of Applied Mathematics
Brown University
Providence, Rhode Island

R. TIMMAN
Instituut voor Toegepaste Wiskunde
Technische Hogeschool, Delft

R. A. TOUPIN
Naval Research Laboratory
Washington 25, D.C.

C. TRUESDELL
801 North College Avenue
Bloomington, Indiana

H. VILLAT
47, bd. A. Blanqui
Paris XIII


ED IN viAW 


CONTENTS


COLEMAN, B. D., & W. NOLL, On the Thermostatics of Continuous Media

PIPKIN, A. C., & R. S. RIVLIN, The Formulation of Constitutive Equations in Continuum Physics. I

MORGENSTERN, D., Herleitung der Plattentheorie aus der dreidimensionalen Elastizitätstheorie

OSTROWSKI, A. M., On the Convergence of the Rayleigh Quotient Iteration for the Computation of Characteristic Roots and Vectors. VI (Usual Rayleigh Quotient for Nonlinear Elementary Divisors)

BIALY, H., Iterative Behandlung linearer Funktionalgleichungen

SCHRÖDER, J., Anwendung von Fixpunktsätzen bei der numerischen Behandlung nichtlinearer Gleichungen in halbgeordneten Räumen


Druck der Universitätsdruckerei H. Stürtz AG., Würzburg
Printed in Germany


